As authorized by the inventor(s), transmitted herewith for
filing is a patent application applied for on behalf of the
inventor(s) according to the provisions of 37 C.F.R. § 1.41(c),
which claims priority under 35 U.S.C. § 119(e) of Provisional
Application No. 60/139,763 filed on June 18, 1999.

Inventor(s): Nickolai ALEXANDROV, Maxim TROUKHAN

For: SEQUENCE-DETERMINED DNA FRAGMENTS AND CORRESPONDING
POLYPEPTIDES ENCODED THEREBY



Enclosed are: 

☒ A specification consisting of a description (866 pages),
Table 1 (725 pages), Claims (5 pages), Schematic 1 (1 page),
and Abstract (1 page), totaling one thousand five hundred
ninety-eight (1598) pages

□ ( ) sheet(s) of formal drawings

□ Certified copy of Priority Document(s)

☒ Executed Declaration in accordance with 37 C.F.R. § 1.64 will
follow

☒ A statement to establish small entity status under 37 C.F.R.
§ 1.9 and 37 C.F.R. § 1.27

□ Preliminary Amendment

☒ Information Sheet

□ Information Disclosure Statement, PTO-1449 and reference(s)

□ Amend the specification by inserting before the first line 
the sentence: 

— This application claims priority on provisional Application 
No. filed on , the entire contents of which are 

hereby incorporated by reference. — 

Other: Power of Attorney regarding Small Entity Statement, 
ATCC deposit receipts PTA-595, PTA-1161, PTA-1411, CD 
containing specification. 
CLAIMS PRESENTED

+ $260.00
+ $130.00

TOTAL $693.00



☒ The application transmitted herewith is filed in accordance
with 37 C.F.R. § 1.41(c). The undersigned has been authorized
by the inventor(s) to file the present application. The
original duly executed declaration together with the
surcharge will be forwarded in due course.

☒ A check in the amount of $693.00 to cover the filing fee is
enclosed.
BIRCH, STEWART, KOLASCH & BIRCH, LLP or Customer No. 2292
P.O. Box 747 

Falls Church, VA 22040-0747 
Telephone: (703) 205-8000 



If necessary, the Commissioner is hereby authorized in this,
overpayment to Deposit Account No. 02-2448 for any additional fees
of time fees.
Title: SEQUENCE-DETERMINED DNA FRAGMENTS AND CORRESPONDING POLYPEPTIDES
ENCODED THEREBY



I hereby state that I am 

□ the owner of the small business concern identified below:
☒ an official of the small business concern empowered to act on behalf of
the concern identified below:

NAME OF SMALL BUSINESS CONCERN CERES, INC. 

ADDRESS OF SMALL BUSINESS CONCERN 3007 Malibu Canyon Road Malibu, CA 90265 

I hereby state that the above identified small business concern qualifies as a small 
business concern as defined in 37 CFR Part 121 for purposes of paying reduced fees to the United 
States Patent and Trademark Office, in that the number of employees of the concern, including 
those of its affiliates, does not exceed 500 persons. For purposes of this statement, (1) the 
number of employees of the business concern is the average over the previous fiscal year of the 
concern of the persons employed on a full-time, part-time, or temporary basis during each of the 
pay periods of the fiscal year, and (2) concerns are affiliates of each other when either, 
directly or indirectly, one concern controls or has the power to control the other, or a third 
party or parties controls or has the power to control both. 

I hereby state that rights under contract or law have been conveyed to and remain with 
the small business concern identified above with regard to the invention described in: 

☒ the specification filed herewith with title as listed above.
□ the application identified above.
□ the patent identified above.

If the rights held by the above identified small business concern are not 
exclusive, each individual, concern, or organization having rights in the invention 
must file separate statements as to their status as small entities, and no rights to 
the invention are held by any person, other than the inventor, who would not qualify 
as an independent inventor under 37 CFR 1.9(c) if that person made the invention, or 
by any concern which would not qualify as a small business concern under 37 CFR 
1.9(d), or a nonprofit organization under 37 CFR 1.9(e). 

Each person, concern, or organization having any rights in the 
invention is listed below: 

☒ no such person, concern, or organization exists.

□ each such person, concern, or organization is listed below.

Separate statements are required from each named person, concern, or 
organization having rights to the invention stating their status as small entities. 
(37 CFR 1.27) 

I acknowledge the duty to file, in this application or patent, notification of
any change in status resulting in loss of entitlement to small entity status prior to
paying, or at the time of paying, the earliest of the issue fee or any maintenance fee
due after the date on which status as a small entity is no longer appropriate. (37
CFR 1.28(b))

NAME OF PERSON SIGNING Mark J. Nuell (Reg. No. 36,623)
FIELD OF THE INVENTION

The present invention relates to isolated polynucleotides that represent a complete 
gene, or a fragment thereof, that is expressed. In addition, the present invention relates to the 
polypeptide or protein corresponding to the coding sequence of these polynucleotides. The 
present invention also relates to isolated polynucleotides that represent regulatory regions of
genes. The present invention also relates to isolated polynucleotides that represent 
untranslated regions of genes. The present invention further relates to the use of these isolated 
polynucleotides and polypeptides and proteins. 

DESCRIPTION OF THE RELATED ART 

Efforts to map and sequence the genome of a number of organisms are in progress; a few
complete genome sequences, for example those of E. coli and Saccharomyces cerevisiae, are
known (Blattner et al., Science 277:1453 (1997); Goffeau et al., Science 274:546 (1996)). The
complete genome of a multicellular organism, C. elegans, has also been sequenced (see the C.
elegans Sequencing Consortium, Science 282:2012 (1998)). To date, no complete genome of a
plant has been sequenced, nor has a complete cDNA complement of any plant been sequenced.

SUMMARY OF THE INVENTION 

The present invention comprises polynucleotides, such as complete cDNA sequences
and/or sequences of genomic DNA encompassing complete genes, fragments of genes, and/or
regulatory elements of genes and/or regions with other functions and/or intergenic regions,
hereinafter collectively referred to as Sequence-Determined DNA Fragments (SDFs), from
different plant species, particularly corn, wheat, soybean, rice and Arabidopsis thaliana, and
other plants and/or mutants, variants, fragments or fusions of said SDFs and polypeptides or
proteins derived therefrom. In some instances, the SDFs span the entirety of a protein-coding
segment. In some instances, the entirety of an mRNA is represented. Other objects of the
invention that are also represented by SDFs of the invention are control sequences, such as, but
not limited to, promoters. Complements of any sequence of the invention are also considered
part of the invention.

Other objects of the invention are polynucleotides comprising exon sequences, 
polynucleotides comprising intron sequences, polynucleotides comprising introns together with
exons, intron/exon junction sequences, 5' untranslated sequences, and 3' untranslated sequences 
of the SDFs of the present invention. Polynucleotides representing the joinder of any exons 
described herein, in any arrangement, for example, to produce a sequence encoding any 
desirable amino acid sequence are within the scope of the invention. 

The present invention also resides in probes useful for isolating and identifying nucleic
acids that hybridize to an SDF of the invention. The probes can be of any length, but
typically are 12 to 2000 nucleotides in length; more typically, 15 to 200 nucleotides long; even
more typically, 18 to 100 nucleotides long.

Yet another object of the invention is a method of isolating and/or identifying nucleic
acids using the following steps:

(a) contacting a probe of the instant invention with a polynucleotide sample under 
conditions that permit hybridization and formation of a polynucleotide duplex; and 

(b) detecting and/or isolating the duplex of step (a). 

The conditions for hybridization can be from low to moderate to high stringency
conditions. The sample can include a polynucleotide having a sequence unique in a plant
genome. Probes and methods of the invention are useful, for example, without limitation, for
mapping of genetic traits and/or for positional cloning of a desired fragment of genomic DNA.

Probes and methods of the invention can also be used for detecting alternatively spliced
messages within a species. Probes and methods of the invention can further be used to detect or
isolate related genes in other plant species using genomic DNA (gDNA) and/or cDNA libraries.
In some instances, especially when longer probes and low to moderate stringency hybridization
conditions are used, the probe will hybridize to a plurality of cDNA and/or gDNA sequences of
a plant. This approach is useful for isolating representatives of gene families which are
identifiable by possession of a common functional domain in the gene product or which have
common cis-acting regulatory sequences. This approach is also useful for identifying
orthologous genes from other organisms.

The present invention also resides in constructs for modulating the expression of the
genes comprised of all or a fragment of an SDF. The constructs comprise all or a fragment of
the expressed SDF, or of a complementary sequence. Examples of constructs include
ribozymes comprising RNA encoded by an SDF or by a sequence complementary thereto,
antisense constructs, constructs comprising coding regions or parts thereof, constructs
comprising promoters, introns, untranslated regions, scaffold attachment regions, methylating
regions, enhancing or reducing regions, DNA and chromatin conformation modifying
sequences, etc. Such constructs can be constructed using viral, plasmid, bacterial artificial
chromosomes (BACs), plasmid artificial chromosomes (PACs), autonomous plant plasmids,
plant artificial chromosomes or other types of vectors and exist in the plant as autonomous
replicating sequences or as DNA integrated into the genome. When inserted into a host cell,
the construct is, preferably, functionally integrated with, or operatively linked to, a
heterologous polynucleotide. For instance, a coding region from an SDF might be operably
linked to a promoter that is functional in a plant.

The present invention also resides in host cells, including bacterial or yeast cells or plant
cells, and plants that harbor constructs such as described above. Another aspect of the invention
relates to methods for modulating expression of specific genes in plants by expression of the
coding sequence of the constructs, by regulation of expression of one or more endogenous genes
in a plant or by suppression of expression of the polynucleotides of the invention in a plant.
Methods of modulation of gene expression include without limitation (1) inserting into a host
cell additional copies of a polynucleotide comprising a coding sequence; (2) modulating an
endogenous promoter in a host cell; (3) inserting antisense or ribozyme constructs into a host
cell; and (4) inserting into a host cell a polynucleotide comprising a sequence encoding a variant,
fragment, or fusion of the native polypeptides of the instant invention.

BRIEF DESCRIPTION OF THE TABLES 

In TABLE 1, the format of the data is as follows:

In Table 1, sequence data are presented in the form of annotation of a reference
sequence. The format is shown below. The reference sequence is shown at the top of the
annotation file as a 7 digit sequence number preceded by ">" (e.g. >5019261). The sequence
identifier is a "gi" number that identifies a specific DNA sequence in the publicly
accessible BLAST Databases on the NCBI FTP web site (accessible at ncbi.nlm.gov/blast).
In particular, the "nt.Z" nucleotide sequence database at the NCBI FTP site utilizes the "gi"
identifiers assigned by NCBI as a unique identifier for each sequence in the databases, thereby
providing a non-redundant database for sequences from various databases, including
GenBank, EMBL, DDBJ (DNA Database of Japan) and PDB (Brookhaven Protein Data
Bank). Thus, the line in TABLE 1 beginning with the sequence number identifies the unique "gi"
identifier followed by the corresponding GenBank (gb) accession number and locus. The
reference sequence number is followed on the next line by data regarding the length of the
sequence ("len") and the number of exons found in the sequence by the analysis program
("nex").

The annotation data are presented in columns; the leftmost column identifies the position of
the putative exon in the gene as initial ("init"), internal ("intr") or terminal ("term"). Genes
considered composed of a single exon are denoted "sngl". The next column describes the
position in the nucleotide sequence beginning the exon ("start") and the next column
describes the position in the nucleotide sequence ending the exon ("stop"). The direction of
the gene is indicated in the next column, "+" indicating 5' to 3' in the direction presented in
the database, "-" indicating the opposite orientation. The "gene number" is given in the final
column. Exons having the same gene number are grouped in the order shown to create the
relevant coding sequence.
>5019261 <= This is the gi number of the public sequence
len = 97208 nex = 121 
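
The record layout described for TABLE 1 lends itself to mechanical parsing. The following is a minimal sketch, not the analysis program used to produce the table. It assumes whitespace-separated columns in the order exon type, start, stop, direction, gene number, and it assumes that start and stop are inclusive coordinates. The two header lines are taken from the example above; the four exon rows are hypothetical and are included only to exercise the code.

```python
from collections import defaultdict


def parse_record(text):
    """Parse one annotation record laid out as described for TABLE 1."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    # First line: ">" plus the gi number; any trailing text is ignored here.
    gi = lines[0].lstrip(">").split()[0]
    # Second line: "len = N nex = M".
    tokens = lines[1].replace("=", " ").split()
    header = dict(zip(tokens[0::2], tokens[1::2]))
    genes = defaultdict(list)
    for ln in lines[2:]:
        # Assumed column order: type, start, stop, direction, gene number.
        kind, start, stop, strand, gene = ln.split()
        genes[int(gene)].append((kind, int(start), int(stop), strand))
    return {
        "gi": gi,
        "len": int(header["len"]),
        "nex": int(header["nex"]),
        "genes": dict(genes),
    }


def coding_length(exons):
    """Total exon length for one gene, assuming inclusive coordinates."""
    return sum(stop - start + 1 for _, start, stop, _ in exons)


# Header lines from the example above; exon rows are hypothetical.
SAMPLE = """
>5019261
len = 97208 nex = 121
init 1200 1350 + 1
intr 1500 1620 + 1
term 1800 2000 + 1
sngl 5000 5900 - 2
"""

record = parse_record(SAMPLE)
for gene_number, exons in sorted(record["genes"].items()):
    print(gene_number, [e[0] for e in exons], coding_length(exons))
```

Grouping rows by gene number while preserving their listed order mirrors the statement that exons having the same gene number are grouped in the order shown to create the relevant coding sequence.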
DETAILED DESCRIPTION OF THE INVENTION

The invention relates to (I) polynucleotides and methods of use thereof, such as

IA. Probes, Primers and Substrates;
IB. Methods of Detection and Isolation;
    B.1. Hybridization;
    B.2. Methods of Mapping;
    B.3. Southern Blotting;
    B.4. Isolating cDNA from Related Organisms;
    B.5. Isolating and/or Identifying Orthologous Genes;
IC. Methods of Inhibiting Gene Expression;
    C.1. Antisense;
    C.2. Ribozyme Constructs;
    C.3. Chimeraplasts;
    C.4. Co-Suppression;
    C.5. Transcriptional Silencing;
    C.6. Other Methods to Inhibit Gene Expression;
ID. Methods of Functional Analysis;
IE. Promoter Sequences and Their Use;
IF. UTRs and/or Intron Sequences and Their Use; and
IG. Coding Sequences and Their Use.



The invention also relates to (II) polypeptides and proteins and methods of use thereof,
such as

IIA. Native Polypeptides and Proteins
    A.1 Antibodies
    A.2 In Vitro Applications
IIB. Polypeptide Variants, Fragments and Fusions
    B.1 Variants
    B.2 Fragments



The invention also includes (III) methods of modulating polypeptide production, such as

IIIA. Suppression
    A.1 Antisense
    A.2 Ribozymes
    A.3 Co-suppression
    A.4 Insertion of Sequences into the Gene to be Modulated
    A.5 Promoter Modulation
    A.6 Expression of Genes containing Dominant-Negative Mutations
IIIB. Enhanced Expression
    B.1 Insertion of an Exogenous Gene
    B.2 Promoter Modulation



The invention further concerns (IV) gene constructs and vector construction, such as

IVA. Coding Sequences
IVB. Promoters
IVC. Signal Peptides

The invention still further relates to

V. Transformation Techniques
Definitions



Allelic variant: An "allelic variant" is an alternative form of the same SDF, which
resides at the same chromosomal locus in the organism. Allelic variations can occur in any
portion of the gene sequence, including regulatory regions. Allelic variants can arise by
normal genetic variation in a population. Allelic variants can also be produced by genetic
engineering methods. An allelic variant can be one that is found in a naturally occurring
plant, including a cultivar or ecotype. An allelic variant may or may not give rise to a
phenotypic change, and may or may not be expressed. An allele can result in a detectable
change in the phenotype of the trait represented by the locus. A phenotypically silent allele
can give rise to a product.

Alternatively spliced messages: Within the context of the current invention,
"alternatively spliced messages" refers to mature mRNAs originating from a single gene with
variations in the number and/or identity of exons, introns and/or intron-exon junctions.

Chimeric: The term "chimeric" is used to describe genes, as defined supra, or constructs
wherein at least two of the elements of the gene or construct, such as the promoter and the
coding sequence and/or other regulatory sequences and/or filler sequences and/or complements
thereof, are heterologous to each other.

Constitutive Promoter: Promoters referred to herein as "constitutive promoters" actively promote
transcription under most, but not necessarily all, environmental conditions and states of
development or cell differentiation. Examples of constitutive promoters include the cauliflower
mosaic virus (CaMV) 35S transcript initiation region and the 1' or 2' promoter derived from
T-DNA of Agrobacterium tumefaciens, and other transcription initiation regions from various
plant genes, such as the maize ubiquitin-1 promoter, known to those of skill.

Coordinately Expressed: The term "coordinately expressed," as used in the current
invention, refers to genes that are expressed at the same or a similar time and/or stage and/or
under the same or similar environmental conditions.
Domain: Domains are fingerprints or signatures that can be used to characterize
protein families and/or parts of proteins. Such fingerprints or signatures can comprise
conserved (1) primary sequence, (2) secondary structure, and/or (3) three-dimensional
conformation. Generally, each domain has been associated with either a family of proteins or
motifs. Typically, these families and/or motifs have been correlated with specific in-vitro
and/or in-vivo activities. A domain can be any length, including the entirety of the sequence
of a protein. Detailed descriptions of the domains, associated families and motifs, and
correlated activities of the polypeptides of the instant invention are described below.
Usually, the polypeptides with designated domain(s) can exhibit at least one activity that is
exhibited by any polypeptide that comprises the same domain(s).

Endogenous: The term "endogenous," within the context of the current invention, refers to
any polynucleotide, polypeptide or protein sequence which is a natural part of a cell or
organisms regenerated from said cell.

Exogenous: "Exogenous," as referred to within, is any polynucleotide, polypeptide
or protein sequence, whether chimeric or not, that is initially or subsequently introduced into
the genome of an individual host cell or the organism regenerated from said host cell by any
means other than by a sexual cross. Examples of means by which this can be accomplished
are described below, and include Agrobacterium-mediated transformation (of dicots, e.g.
Salomon et al., EMBO J. 3:141 (1984); Herrera-Estrella et al., EMBO J. 2:987 (1983); of
monocots, representative papers are those by Escudero et al., Plant J. 10:355 (1996), Ishida et
al., Nature Biotechnology 14:745 (1996), May et al., Bio/Technology 13:486 (1995)), biolistic
methods (Armaleo et al., Current Genetics 17:97 (1990)), electroporation, in planta
techniques, and the like. Such a plant containing the exogenous nucleic acid is referred to
here as a T0 for the primary transgenic plant and T1 for the first generation. The term
"exogenous" as used herein is also intended to encompass inserting a naturally found element
into a non-naturally found location.



Filler sequence: As used herein, "filler sequence" refers to any nucleotide sequence that
is inserted into a DNA construct to evoke a particular spacing between particular components
such as a promoter and a coding region and may provide an additional attribute such as a
restriction enzyme site.

Gene: The term "gene," as used in the context of the current invention, encompasses all
regulatory and coding sequence contiguously associated with a single hereditary unit with a
genetic function (see SCHEMATIC 1). Genes can include non-coding sequences that
modulate the genetic function that include, but are not limited to, those that specify
polyadenylation, transcriptional regulation, DNA conformation, chromatin conformation,
extent and position of base methylation and binding sites of proteins that control all of these.
Genes comprised of "exons" (coding sequences), which may be interrupted by "introns"
(non-coding sequences), encode proteins. A gene's genetic function may require only RNA
expression or protein production, or may only require binding of proteins and/or nucleic acids
without associated expression. In certain cases, genes adjacent to one another may share
sequence in such a way that one gene will overlap the other. A gene can be found within the
genome of an organism, artificial chromosome, plasmid, vector, etc., or as a separate isolated
entity.

Gene Family: "Gene family" is used in the current invention to describe a group of 
functionally related genes, each of which encodes a separate protein. 


Heterologous sequences: "Heterologous sequences" are those that are not operatively
linked or are not contiguous to each other in nature. For example, a promoter from corn is
considered heterologous to an Arabidopsis coding region sequence. Also, a promoter from a
gene encoding a growth factor from corn is considered heterologous to a sequence encoding the
corn receptor for the growth factor. Regulatory element sequences, such as UTRs or 3' end
termination sequences that do not originate in nature from the same gene as the coding sequence
originates from, are considered heterologous to said coding sequence. Elements operatively
linked in nature and contiguous to each other are not heterologous to each other. On the other
hand, these same elements remain operatively linked but become heterologous if other filler
sequence is placed between them. Thus, the promoter and coding sequences of a corn gene
expressing an amino acid transporter are not heterologous to each other, but the promoter and
coding sequence of a corn gene operatively linked in a novel manner are heterologous.
Homologous gene: In the current invention, "homologous gene" refers to a gene that shares
sequence similarity with the gene of interest. This similarity may be in only a fragment of the
sequence and often represents a functional domain such as, without limitation, a DNA binding
domain, a domain with tyrosine kinase activity, or the like. The
functional activities of homologous genes are not necessarily the same.

Inducible Promoter: An "inducible promoter" in the context of the current invention
refers to a promoter which is regulated under certain conditions, such as light, chemical
concentration, protein concentration, conditions in an organism, cell, or organelle, etc. A typical
example of an inducible promoter, which can be utilized with the polynucleotides of the present
invention, is PARSK1, the promoter from the Arabidopsis gene encoding a serine-threonine
kinase enzyme, which is induced by dehydration, abscisic acid and sodium
chloride (Wang and Goodman, Plant J. 8:37 (1995)). Examples of environmental conditions that
may affect transcription by inducible promoters include anaerobic conditions, elevated
temperature, or the presence of light.

Intergenic region: "Intergenic region," as used in the current invention, refers to
nucleotide sequence occurring in the genome that separates adjacent genes.

Mutant gene: In the current invention, "mutant" refers to a heritable change in DNA
sequence at a specific location. Mutants of the current invention may or may not have an
associated identifiable function when the mutant gene is transcribed.

Orthologous Gene: In the current invention, "orthologous gene" refers to a second gene that
encodes a gene product that performs a similar function as the product of a first gene. The
orthologous gene may also have a degree of sequence similarity to the first gene. The
orthologous gene may encode a polypeptide that exhibits a degree of sequence similarity to a
polypeptide corresponding to a first gene. The sequence similarity can be found within a
functional domain or along the entire length of the coding sequence of the genes and/or their
corresponding polypeptides.
Percentage of sequence identity: "Percentage of sequence identity," as used herein, is
determined by comparing two optimally aligned sequences over a comparison window, where
the fragment of the polynucleotide or amino acid sequence in the comparison window may
comprise additions or deletions (e.g., gaps or overhangs) as compared to the reference sequence
(which does not comprise additions or deletions) for optimal alignment of the two sequences.
The percentage is calculated by determining the number of positions at which the identical
nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched
positions, dividing the number of matched positions by the total number of positions in the
window of comparison and multiplying the result by 100 to yield the percentage of sequence
identity. Optimal alignment of sequences for comparison may be conducted by the local
homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology
alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for
similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (USA) 85:2444 (1988), by
computerized implementations of these algorithms (GAP, BESTFIT, BLAST, FASTA, and
TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575
Science Dr., Madison, WI), or by inspection. Given that two sequences have been identified for
comparison, GAP and BESTFIT are preferably employed to determine their optimal alignment.
Typically, the default values of 5.00 for gap weight and 0.30 for gap length weight are used.
The term "substantial sequence identity" between polynucleotide or polypeptide sequences
refers to a polynucleotide or polypeptide comprising a sequence that has at least 80% sequence
identity, preferably at least 85%, more preferably at least 90% and most preferably at least 95%,
even more preferably, at least 96%, 97%, 98% or 99% sequence identity compared to a
reference sequence using the programs.
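
The calculation defined above can be illustrated with a short sketch. It assumes the two sequences have already been optimally aligned by one of the cited programs and that gaps are written as "-"; it does not implement GAP, BESTFIT or any other alignment algorithm. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of sequence identity over an already-aligned comparison window.

    Matched positions are those where both sequences carry the same residue;
    a gap ("-") never counts as a match. The denominator is the total number
    of positions in the window, as in the definition above.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def has_substantial_identity(pct, threshold=80.0):
    """True when identity meets the 'at least 80%' floor stated in the text."""
    return pct >= threshold


# Hypothetical 10-position window: 8 matched positions, 1 gap, 1 mismatch.
pct = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
print(pct, has_substantial_identity(pct))  # 80.0 True
```

Passing a higher threshold (85, 90, 95 and so on) applies the stricter preferred levels listed in the definition.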

Plant Promoter: A "plant promoter" is a promoter capable of initiating transcription in
plant cells and can drive or facilitate transcription of a fragment of the SDF of the instant
invention or a coding sequence of the SDF of the instant invention. Such promoters need not
be of plant origin. For example, promoters derived from plant viruses, such as the CaMV 35S
promoter, or from Agrobacterium tumefaciens, such as the T-DNA promoters, can be plant
promoters. A typical example of a plant promoter of plant origin is the maize ubiquitin-1
(ubi-1) promoter known to those of skill.
Promoter: The term "promoter," as used herein, refers to a region of sequence
determinants located upstream from the start of transcription of a gene and which are involved in
recognition and binding of RNA polymerase and other proteins to initiate and modulate
transcription. A basal promoter is the minimal sequence necessary for assembly of a
transcription complex required for transcription initiation. Basal promoters frequently include a
"TATA box" element usually located between 15 and 35 nucleotides upstream from the site of
initiation of transcription. Basal promoters also sometimes include a "CCAAT box" element
(typically a sequence CCAAT) and/or a GGGCG sequence, usually located between 40 and 200
nucleotides, preferably 60 to 120 nucleotides, upstream from the start site of transcription.
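
The positional windows given for basal promoter elements can be checked with a simple scan. The sketch below is illustrative only: it assumes the input string ends immediately before the transcription start site, measures distance from the first base of the motif, and uses the literal strings "TATA" and "CCAAT" as stand-ins for the respective boxes. The function name and the synthetic sequence are hypothetical.

```python
def find_upstream_motif(upstream, motif, min_dist, max_dist):
    """Return distances (in nucleotides upstream of the start site) at which
    `motif` begins, keeping only hits inside [min_dist, max_dist].

    `upstream` is assumed to end immediately before the transcription start
    site, so its last base lies 1 nucleotide upstream.
    """
    upstream = upstream.upper()
    motif = motif.upper()
    n = len(upstream)
    hits = []
    i = upstream.find(motif)
    while i != -1:
        dist = n - i  # distance of the motif's first base from the start site
        if min_dist <= dist <= max_dist:
            hits.append(dist)
        i = upstream.find(motif, i + 1)  # allow overlapping occurrences
    return hits


# Synthetic 125-nt upstream region: CCAAT begins 75 nt and TATA 30 nt upstream.
upstream_region = "G" * 50 + "CCAAT" + "G" * 40 + "TATA" + "G" * 26

print(find_upstream_motif(upstream_region, "TATA", 15, 35))    # [30]
print(find_upstream_motif(upstream_region, "CCAAT", 40, 200))  # [75]
print(find_upstream_motif(upstream_region, "CCAAT", 60, 120))  # [75]
```

The last call applies the narrower, preferred 60 to 120 nucleotide window to the same sequence.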

Public sequence: The term "public sequence," as used in the context of the instant
application, refers to any sequence that has been deposited in a publicly accessible database.
This term encompasses both amino acid and nucleotide sequences. Such sequences are
publicly accessible, for example, on the BLAST databases on the NCBI FTP web site
(accessible at ncbi.nlm.gov/blast). The database at the NCBI FTP site utilizes "gi" numbers
assigned by NCBI as a unique identifier for each sequence in the databases, thereby
providing a non-redundant database for sequences from various databases, including
GenBank, EMBL, DDBJ (DNA Database of Japan) and PDB (Brookhaven Protein Data
Bank).

Regulatory Sequence: The term "regulatory sequence," as used in the current
invention, refers to any nucleotide sequence that influences transcription or translation
initiation and rate, and stability and/or mobility of the transcript or polypeptide product.
Regulatory sequences include, but are not limited to, promoters, promoter control elements,
protein binding sequences, 5' and 3' UTRs, transcriptional start site, termination sequence,
polyadenylation sequence, introns, certain sequences within a coding sequence, etc.

Related Sequences: "Related sequences" refer to either a polypeptide or a nucleotide
sequence that exhibits some degree of sequence similarity with a sequence described in Table 1.
Scaffold Attachment Region (SAR): As used herein, "scaffold attachment region" is a DNA
sequence that anchors chromatin to the nuclear matrix or scaffold to generate loop domains
that can have either a transcriptionally active or inactive structure (Spiker and Thompson
(1996) Plant Physiol. 110:15-21).

Sequence-determined DNA fragments (SDFs): "Sequence-determined DNA fragments"
as used in the current invention are isolated sequences of genes, fragments of genes,
intergenic regions or contiguous DNA from plant genomic DNA or cDNA or RNA, the
sequence of which has been determined.

Signal Peptide: A "signal peptide" as used in the current invention is an amino acid
sequence that targets the protein for secretion, for transport to an intracellular compartment or
organelle or for incorporation into a membrane. Signal peptides are indicated in the tables
and a more detailed description is located below.

Specific Promoter: In the context of the current invention, "specific promoters" refers to a
subset of inducible promoters that have a high preference for being induced in a specific
tissue or cell and/or at a specific time during development of an organism. By "high
preference" is meant at least 3-fold, preferably 5-fold, more preferably at least 10-fold, still
more preferably at least 20-fold, 50-fold or 100-fold increase in transcription in the desired
tissue over the transcription in any other tissue. Typical examples of temporal and/or tissue
specific promoters of plant origin that can be used with the polynucleotides of the present
invention are: PTA29, a promoter which is capable of driving gene transcription specifically in
tapetum and only during anther development (Koltunow et al., Plant Cell 2:1201 (1990)); RCc2
and RCc3, promoters that direct root-specific gene transcription in rice (Xu et al., Plant Mol.
Biol. 27:237 (1995)); TobRB27, a root-specific promoter from tobacco (Yamamoto et al., Plant
Cell 3:371 (1991)). Examples of tissue-specific promoters under developmental control include
promoters that initiate transcription only in certain tissues or organs, such as root, ovule, fruit,
seeds, or flowers. Other suitable promoters include those from genes encoding storage proteins
or the lipid body membrane protein, oleosin. A few root-specific promoters are noted above.

Stringency "Stringency" as used herein is a function of probe length, probe composition (G
+ C content), and salt concentration, organic solvent concentration, and temperature of
hybridization or wash conditions. Stringency is typically compared by the parameter Tm, which
is the temperature at which 50% of the complementary molecules in the hybridization are
hybridized, in terms of a temperature differential from Tm. High stringency conditions are those
providing a condition of Tm - 5°C to Tm - 10°C. Medium or moderate stringency conditions are
those providing Tm - 20°C to Tm - 29°C. Low stringency conditions are those providing a
condition of Tm - 40°C to Tm - 48°C. The relationship of hybridization conditions to Tm (in °C) is
expressed in the mathematical equation

Tm = 81.5 + 16.6(log10[Na+]) + 0.41(%G+C) - (600/N) (1)

where N is the length of the probe. This equation works well for probes 14 to 70 nucleotides in
length that are identical to the target sequence. The equation below for Tm of DNA-DNA
hybrids is useful for probes in the range of 50 to greater than 500 nucleotides, and for conditions
that include an organic solvent (formamide).

Tm = 81.5 + 16.6 log {[Na+]/(1 + 0.7[Na+])} + 0.41(%G+C) - 500/L - 0.63(%formamide) (2)

where L is the length of the probe in the hybrid. (P. Tijssen, "Hybridization with Nucleic
Acid Probes" in Laboratory Techniques in Biochemistry and Molecular Biology, P.C. van
der Vliet, ed., c. 1993 by Elsevier, Amsterdam.) The Tm of equation (2) is affected by the
nature of the hybrid; for DNA-RNA hybrids Tm is 10-15°C higher than calculated, for
RNA-RNA hybrids Tm is 20-25°C higher. Because the Tm decreases about 1°C for each 1%
decrease in homology when a long probe is used (Bonner et al., J. Mol. Biol. 81:123 (1973)),
stringency conditions can be adjusted to favor detection of identical genes or related family
members.
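
By way of non-limiting illustration, equation (2) can be evaluated as in the following Python sketch. The function name tm_long_probe and its parameter names are hypothetical and are not part of the specification. The sketch assumes that the logarithm is base 10, that [Na+] is expressed in mol/L, and that the formamide term is subtracted (formamide lowers Tm). The input values in the example are arbitrary.

```python
import math

def tm_long_probe(na_molar, gc_percent, length_nt, formamide_percent=0.0):
    """Estimate Tm (deg C) of a DNA-DNA hybrid using equation (2).

    Assumptions not stated in the text: log is base 10 and
    na_molar is the sodium ion concentration in mol/L.
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("na_molar and length_nt must be positive")
    salt_term = 16.6 * math.log10(na_molar / (1.0 + 0.7 * na_molar))
    return (81.5 + salt_term + 0.41 * gc_percent
            - 500.0 / length_nt - 0.63 * formamide_percent)

# Arbitrary example: 500-nt probe, 50% G+C, 0.75 M Na+, 50% formamide.
tm = tm_long_probe(na_molar=0.75, gc_percent=50.0, length_nt=500,
                   formamide_percent=50.0)
print(round(tm, 1))
```

For a DNA-RNA or RNA-RNA hybrid, the value returned would be raised by the corrections given in the preceding paragraph.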

Equation (2) is derived assuming equilibrium and therefore, hybridizations according
to the present invention are most preferably performed under conditions of probe excess and
for sufficient time to achieve equilibrium. The time required to reach equilibrium can be
shortened by inclusion of a hybridization accelerator such as dextran sulfate or another high
volume polymer in the hybridization buffer.

Stringency can be controlled during the hybridization reaction or after hybridization
has occurred by altering the salt and temperature conditions of the wash solutions used. The
formulas shown above are equally valid when used to compute the stringency of a wash
solution. Preferred wash solution stringencies lie within the ranges stated above; high
stringency is 5-8°C below Tm, medium or moderate stringency is 26-29°C below and low
stringency is 45-48°C below Tm.
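
As a minimal sketch only, the stringency ranges defined above can be expressed as a lookup on the difference between Tm and the hybridization or wash temperature. The names GENERAL_RANGES, PREFERRED_WASH_RANGES and stringency_class are hypothetical. Temperatures falling between the stated ranges are reported here as "unclassified", which is a choice made for this illustration and not a definition from the text.

```python
# Degrees Celsius below Tm, taken from the ranges stated in the text.
GENERAL_RANGES = {"high": (5, 10), "medium": (20, 29), "low": (40, 48)}
PREFERRED_WASH_RANGES = {"high": (5, 8), "medium": (26, 29), "low": (45, 48)}

def stringency_class(tm_c, temp_c, ranges=GENERAL_RANGES):
    """Return the stringency label for a temperature relative to Tm."""
    delta = tm_c - temp_c  # how far below Tm the condition is
    for label, (low, high) in ranges.items():
        if low <= delta <= high:
            return label
    return "unclassified"

# Arbitrary example: Tm of 70 deg C, wash at 63 deg C (7 deg C below Tm).
print(stringency_class(70.0, 63.0, PREFERRED_WASH_RANGES))
```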

Substantially free of A composition containing A is "substantially free of" B when
at least 85% by weight of the total A+B in the composition is A. Preferably, A comprises at
least about 90% by weight of the total of A+B in the composition, more preferably at least
about 95% or even 99% by weight. For example, a plant gene or DNA sequence can be
considered substantially free of other plant genes or DNA sequences.
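
The weight criterion above amounts to a simple ratio test, sketched below in Python. The function name substantially_free and its parameters are hypothetical, and the masses may be in any unit so long as both use the same one.

```python
def substantially_free(mass_a, mass_b, threshold=0.85):
    """True when A is at least `threshold` of the total A+B by weight."""
    total = mass_a + mass_b
    if total <= 0:
        raise ValueError("total mass must be positive")
    return mass_a / total >= threshold

# Arbitrary example: 92 parts A and 8 parts B by weight.
print(substantially_free(92.0, 8.0))                   # default 85% criterion
print(substantially_free(92.0, 8.0, threshold=0.95))   # stricter 95% criterion
```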

Translational start site In the context of the current invention, a "translational start
site" is usually an ATG in the cDNA transcript, more usually the first ATG. A single cDNA,
however, may have multiple translational start sites.
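
For illustration, candidate translational start sites can be listed by scanning a cDNA sequence for every ATG, as in the following sketch. The function name find_atg_positions is hypothetical, positions are 0-based, only the strand supplied is scanned, and the example sequence is invented.

```python
def find_atg_positions(cdna):
    """Return the 0-based position of every ATG in the given strand."""
    seq = cdna.upper()
    positions = []
    start = seq.find("ATG")
    while start != -1:
        positions.append(start)
        start = seq.find("ATG", start + 1)
    return positions

# Invented example sequence containing two ATG triplets.
sites = find_atg_positions("ccATGaaATGtt")
first_atg = sites[0] if sites else None  # "more usually the first ATG"
print(sites, first_atg)
```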

Transcription start site "Transcription start site" is used in the current invention to
describe the point at which transcription is initiated. This point is typically located about 25
nucleotides downstream from a TFIID binding site, such as a TATA box. Transcription can
initiate at one or more sites within the gene, and a single gene may have multiple transcriptional
start sites, some of which may be specific for transcription in a particular cell-type or tissue.

Untranslated region (UTR) A "UTR" is any contiguous series of nucleotide bases that is
transcribed, but is not translated. These untranslated regions may be associated with
particular functions such as increasing mRNA message stability. Examples of UTRs include,
but are not limited to, polyadenylation signals, termination sequences, sequences located
between the transcriptional start site and the first exon (5' UTR) and sequences located
between the last exon and the end of the mRNA (3' UTR).

Variant: The term "variant" is used herein to denote a polypeptide or protein or
polynucleotide molecule that differs from others of its kind in some way. For example,
polypeptide and protein variants can consist of changes in amino acid sequence and/or charge
and/or post-translational modifications (such as glycosylation, etc).

DETAILED DESCRIPTION OF THE INVENTION

I. Polynucleotides 

Exemplified SDFs of the invention represent fragments of the genome of corn, wheat,
rice, soybean or Arabidopsis and/or represent mRNA expressed from that genome. The isolated
nucleic acid of the invention also encompasses corresponding fragments of the genome and/or
cDNA complement of other organisms as described in detail below.

Polynucleotides of the invention can be isolated from polynucleotide libraries using
primers comprising sequence similar to those described by Table 1. See, for example, the
methods described in Sambrook et al., supra.

Alternatively, the polynucleotides of the invention can be produced by chemical
synthesis. Such synthesis methods are described below.

It is contemplated that the nucleotide sequences presented herein may contain some
small percentage of errors. These errors may arise in the normal course of determination of
nucleotide sequences. Sequence errors can be corrected by obtaining seeds deposited under the
accession numbers cited herein, propagating them, isolating genomic DNA or appropriate
mRNA from the resulting plants or seeds thereof, amplifying the relevant fragment of the
genomic DNA or mRNA using primers having a sequence that flanks the erroneous sequence,
and sequencing the amplification product.

I.A. Probes, Primers and Substrates

SDFs of the invention can be applied to substrates for use in array applications such
as, but not limited to, assays of global gene expression, for example under varying conditions
of development or growth. The arrays can also be used in diagnostic or forensic
methods (WO95/35505, US 5,445,943 and US 5,410,270).

Probes and primers of the instant invention will hybridize to a polynucleotide
comprising a sequence in Table 1. Though many different nucleotide sequences can encode
an amino acid sequence, the sequences of Table 1 are generally preferred for encoding
polypeptides of the invention. However, the sequence of the probes and/or primers of the
instant invention need not be identical to those in Table 1 or the complements thereof. For
example, some variation in probe or primer sequence and/or length can allow additional
family members to be detected, as well as orthologous genes and more taxonomically distant
related sequences. Similarly, probes and/or primers of the invention can include additional
nucleotides that serve as a label for detecting the formed duplex or for subsequent cloning
purposes.

Probe length will vary depending on the application. For use as primers, probes are
12-40 nucleotides, preferably 18-30 nucleotides long. For use in mapping, probes are
preferably 50 to 500 nucleotides, more preferably 100-250 nucleotides long. For Southern
hybridizations, probes as long as several kilobases can be used as explained below.

The probes and/or primers can be produced by synthetic procedures such as the
triester method of Matteucci et al. J. Am. Chem. Soc. 103:3185 (1981); or according to Urdea
et al. Proc. Natl. Acad. Sci. 80:7461 (1981) or using commercially available automated
oligonucleotide synthesizers.

I.B. Methods of Detection and Isolation

The polynucleotides of the invention can be utilized in a number of methods known to
those skilled in the art as probes and/or primers to isolate and detect polynucleotides,
including, without limitation: Southerns, Northerns, Branched DNA hybridization assays,
polymerase chain reaction, and microarray assays, and variations thereof. Specific methods
given by way of examples, and discussed below include:
Hybridization
Methods of Mapping
Southern Blotting
Isolating cDNA from Related Organisms
Isolating and/or Identifying Orthologous Genes.

Also, the nucleic acid molecules of the invention can be used in other methods, such as high
density oligonucleotide hybridizing assays, described, for example, in U.S. Pat. Nos.
6,004,753; 5,945,306; 5,945,287; 5,945,308; 5,919,686; 5,919,661; 5,919,627; 5,874,248;
5,871,973; 5,871,971; and 5,871,930; and PCT Pub. Nos. WO 9946380; WO 9933981; WO
9933870; WO 9931252; WO 9915658; WO 9906572; WO 9858052; WO 9958672; and WO
9810858.



B.1. Hybridization

The isolated SDFs of Table 1 of the present invention can be used as probes and/or
primers for detection and/or isolation of related polynucleotide sequences through
hybridization. Hybridization of one nucleic acid to another constitutes a physical property
that defines the subject SDF of the invention and the identified related sequences. Also, such
hybridization imposes structural limitations on the pair. A good general discussion of the
factors for determining hybridization conditions is provided by Sambrook et al. ("Molecular
Cloning, a Laboratory Manual", 2nd ed., c. 1989 by Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY; see esp., chapters 11 and 12). Additional considerations and details of the
physical chemistry of hybridization are provided by G.H. Keller and M.M. Manak "DNA
Probes", 2nd Ed. pp. 1-25, c. 1993 by Stockton Press, New York, NY.

Depending on the stringency of the conditions under which these probes and/or primers
are used, polynucleotides exhibiting a wide range of similarity to those in Table 1 can be
detected or isolated. When the practitioner wishes to examine the result of membrane
hybridizations under a variety of stringencies, an efficient way to do so is to perform the
hybridization under a low stringency condition, then to wash the hybridization membrane
under increasingly stringent conditions.

When using SDFs to identify orthologous genes in other species, the practitioner will
preferably adjust the amount of target DNA of each species so that, as nearly as is practical,
the same number of genome equivalents are present for each species examined. This
prevents faint signals from species having large genomes, and thus small numbers of genome
equivalents per mass of DNA, from erroneously being interpreted as absence of the
corresponding gene in the genome.
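
The adjustment for genome equivalents can be sketched as follows. This is an illustration under stated assumptions and not a procedure from the specification: the function name dna_mass_ng is hypothetical, an average mass of about 650 g/mol per base pair is assumed, and the two genome sizes in the example are arbitrary values rather than measured sizes for any species.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one base pair

def dna_mass_ng(genome_size_bp, genome_equivalents):
    """Mass of DNA (ng) that contains the requested number of genome copies."""
    grams_per_genome = genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams_per_genome * genome_equivalents * 1e9

# Arbitrary example: load one million genome equivalents of two species
# whose genomes are assumed to be 1e8 bp and 2e9 bp.
small = dna_mass_ng(1e8, 1e6)
large = dna_mass_ng(2e9, 1e6)
print(round(small, 1), round(large, 1))
```

Under these assumptions the species with the larger genome requires proportionally more DNA by mass to present the same number of gene copies on the blot.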

The probes and/or primers of the instant invention can also be used to detect or isolate
nucleotides that are "identical" to the probes or primers. Two nucleic acid sequences or
polypeptides are said to be "identical" if the sequence of nucleotides or amino acid residues,
respectively, in the two sequences is the same when aligned for maximum correspondence as
described below.

Isolated polynucleotides within the scope of the invention also include allelic variants of
the specific sequences presented in Table 1. The probes and/or primers of the invention can also
be used to detect and/or isolate polynucleotides exhibiting at least 80% sequence identity with
the sequences of Table 1 or fragments thereof.
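
As a minimal sketch of how a percent identity such as the 80% figure might be computed, the following Python function scores two sequences that have already been aligned. The function name percent_identity is hypothetical. It assumes gaps are written as "-" and that identity is expressed relative to the number of alignment columns; other denominators are in use, and the specification's own alignment method is the one described elsewhere in the text.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
                  if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)

# Invented example: 8 of 10 aligned positions match.
identity = percent_identity("ACGTACGTAC", "ACGTACGTTT")
print(identity, identity >= 80.0)
```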
With respect to nucleotide sequences, degeneracy of the genetic code provides the
possibility to substitute at least one base of the base sequence of a gene with a different base
without causing the amino acid sequence of the polypeptide produced from the gene to be
changed. Hence, the DNA of the present invention may also have any base sequence that has
been changed from a sequence in Table 1 by substitution in accordance with degeneracy of
genetic code. References describing codon usage include: Carels et al., J. Mol. Evol. 46: 45
(1998) and Fennoy et al., Nucl. Acids Res. 21(23): 5294 (1993).
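
To illustrate degeneracy, the sketch below translates two DNA sequences with the standard genetic code and checks whether they encode the same amino acid sequence. The names CODON_TABLE, translate and same_protein are hypothetical, the two example sequences are invented, and codons containing characters other than A, C, G or T are not handled.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a coding DNA sequence in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def same_protein(dna_a, dna_b):
    """True when both sequences encode the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)

# Invented example: different bases, same encoded peptide (Met-Leu-Ser).
print(translate("ATGCTTTCA"), same_protein("ATGCTTTCA", "ATGTTGAGC"))
```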

B.2. Mapping 

The isolated SDF DNA of the invention can be used to create various types of genetic
and physical maps of the genome of corn, Arabidopsis, soybean, rice, wheat, or other plants.
Some SDFs may be absolutely associated with particular phenotypic traits, allowing
construction of gross genetic maps. While not all SDFs will immediately be associated with
a phenotype, all SDFs can be used as probes for identifying polymorphisms associated with
phenotypes of interest. Briefly, one method of mapping involves total DNA isolation from
individuals. The DNA is subsequently cleaved with one or more restriction enzymes, separated
according to mass, transferred to a solid support and hybridized with SDF DNA, and the pattern
of fragments is compared. Polymorphisms associated with a particular SDF are visualized as
differences in the size of fragments produced between individual DNA samples after
digestion with a particular restriction enzyme and hybridization with the SDF. After
identification of polymorphic SDF sequences, linkage studies can be conducted. By using the
individuals showing polymorphisms as parents in crossing programs, F2 progeny
recombinants or recombinant inbreds, for example, are then analyzed. The order of DNA
polymorphisms along the chromosomes can be determined based on the frequency with
which they are inherited together versus independently. The closer two polymorphisms are
together in a chromosome the higher the probability that they are inherited together.
Integration of the relative positions of all the polymorphisms and associated marker SDFs can
produce a genetic map of the species, where the distances between markers reflect the
recombination frequencies in that chromosome segment.
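
The idea that linked polymorphisms are inherited together more often can be illustrated with the following sketch, which counts how often two markers carry alleles from different parents. The function name recombinant_fraction is hypothetical. It assumes each progeny individual is scored as "A" or "B" (the parent of origin, as in homozygous recombinant inbreds), treats any other character as missing data, and returns the observed proportion of recombinant individuals rather than a corrected map distance. The genotype strings are invented.

```python
def recombinant_fraction(marker1, marker2):
    """Fraction of scored individuals whose two markers differ in parent of origin."""
    if len(marker1) != len(marker2):
        raise ValueError("markers must be scored on the same individuals")
    scored = [(a, b) for a, b in zip(marker1, marker2)
              if a in ("A", "B") and b in ("A", "B")]
    if not scored:
        raise ValueError("no individuals scored for both markers")
    recombinants = sum(1 for a, b in scored if a != b)
    return recombinants / len(scored)

# Invented scores for ten progeny at two SDF-associated polymorphisms.
m1 = "AAAABBBBAB"
m2 = "AAABBBBAAB"
print(recombinant_fraction(m1, m2))
```

A lower fraction suggests the two polymorphisms lie closer together on the chromosome.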

The use of recombinant inbred lines for such genetic mapping is described for
Arabidopsis by Alonso-Blanco et al. (Methods in Molecular Biology, vol. 82, "Arabidopsis
Protocols", pp. 137-146, J.M. Martinez-Zapater and J. Salinas, eds., c. 1998 by Humana
Press, Totowa, NJ) and for corn by Burr ("Mapping Genes with Recombinant Inbreds", pp.
249-254. In Freeling, M. and V. Walbot (Ed.), The Maize Handbook, c. 1994 by Springer-
Verlag New York, Inc.: New York, NY, USA; Berlin Germany; Burr et al. Genetics (1998)
118: 519; Gardiner, J. et al., (1993) Genetics 134: 917). This procedure, however, is not
limited to plants and can be used for other organisms (such as yeast) or for individual cells.
The SDFs of the present invention can also be used for simple sequence repeat (SSR)
mapping. Rice SSR mapping is described by Morgante et al. (The Plant Journal (1993) 3:
165), Panaud et al. (Genome (1995) 38: 1170); Senior et al. (Crop Science (1996) 36: 1676),
Taramino et al. (Genome (1996) 39: 277) and Ahn et al. (Molecular and General Genetics
(1993) 241: 483-90). SSR mapping can be achieved using various methods. In one instance,
polymorphisms are identified when sequence specific probes contained within an SDF
flanking an SSR are made and used in polymerase chain reaction (PCR) assays with template
DNA from two or more individuals of interest. Here, a change in the number of tandem
repeats between the SSR-flanking sequences produces differently sized fragments (U.S.
Patent 5,766,847). Alternatively, polymorphisms can be identified by using the PCR
fragment produced from the SSR-flanking sequence specific primer reaction as a probe
against Southern blots representing different individuals (U.H. Refseth et al., (1997)
Electrophoresis 18: 1519).
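
The size difference produced by a change in tandem repeat number can be worked through as in the sketch below. The function name ssr_product_length is hypothetical, the flanking lengths are taken to include the primers, and all of the values in the example are invented.

```python
def ssr_product_length(flank5_len, flank3_len, repeat_unit, n_repeats):
    """Expected PCR product length for an SSR allele with n_repeats copies."""
    return flank5_len + len(repeat_unit) * n_repeats + flank3_len

# Invented example: a GA repeat with 12 copies in one individual and
# 15 copies in another, amplified with the same flanking primers.
allele_1 = ssr_product_length(60, 55, "GA", 12)
allele_2 = ssr_product_length(60, 55, "GA", 15)
print(allele_1, allele_2, allele_2 - allele_1)
```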

Genetic and physical maps of crop species have many uses. For example, these maps
can be used to devise positional cloning strategies for isolating novel genes from the mapped
crop species. In addition, because the genomes of closely related species are largely syntenic
(that is, they display the same ordering of genes within the genome), these maps can be used
to isolate novel alleles from relatives of crop species by positional cloning strategies.

The various types of maps discussed above can be used with the SDFs of the
invention to identify Quantitative Trait Loci (QTLs). Many important crop traits, such as the
solids content of tomatoes, are quantitative traits and result from the combined interactions of
several genes. These genes reside at different loci in the genome, oftentimes on different
chromosomes, and generally exhibit multiple alleles at each locus. The SDFs of the
invention can be used to identify QTLs and isolate specific alleles as described by de Vicente
and Tanksley (Genetics 134:585 (1993)). In addition to isolating QTL alleles in present crop
species, the SDFs of the invention can also be used to isolate alleles from the corresponding
QTL of wild relatives. Transgenic plants having various combinations of QTL alleles can
then be created and the effects of the combinations measured. Once a desired allele
combination has been identified, crop improvement can be accomplished either through
biotechnological means or by directed conventional breeding programs (for review see
Tanksley and McCouch, Science 277:1063 (1997)).

In another embodiment, the SDFs can be used to help create physical maps of the
genome of corn, Arabidopsis and related species. Where SDFs have been ordered on a
genetic map, as described above, they can be used as probes to discover which clones in
large libraries of plant DNA fragments in YACs, BACs, etc. contain the same SDF or similar
sequences, thereby facilitating the assignment of the large DNA fragments to chromosomal
positions. Subsequently, the large BACs, YACs, etc. can be ordered unambiguously by more
detailed studies of their sequence composition (e.g. Marra et al. (1997) Genome Research
7:1072-1084) and by using their end or other sequences to find the identical sequences in
other cloned DNA fragments. The overlapping of DNA sequences in this way allows large
contigs of plant sequences to be built that, when sufficiently extended, provide a complete
physical map of a chromosome. Sometimes the SDFs themselves will provide the means of
joining cloned sequences into a contig.
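
The joining of clones into contigs through shared sequences can be sketched as a simple grouping problem. In the following illustration, which is not a method recited in the specification, each clone is represented by the set of SDF markers to which it hybridizes, and clones sharing at least one marker are placed in the same contig. The function name build_contigs and the clone and marker names are hypothetical; the sketch groups clones but does not order them within a contig.

```python
def build_contigs(clone_markers):
    """Group clones so that clones sharing any marker fall in one contig."""
    contigs = []  # list of (set of clone names, set of marker names)
    for clone, markers in clone_markers.items():
        clones, marks = {clone}, set(markers)
        remaining = []
        for c_clones, c_marks in contigs:
            if c_marks & marks:          # shared marker: merge into new group
                clones |= c_clones
                marks |= c_marks
            else:
                remaining.append((c_clones, c_marks))
        remaining.append((clones, marks))
        contigs = remaining
    return [sorted(c) for c, _ in contigs]

# Invented hybridization results for four clones.
hits = {
    "BAC1": {"SDF1", "SDF2"},
    "BAC2": {"SDF2", "SDF3"},
    "YAC1": {"SDF9"},
    "BAC3": {"SDF3", "SDF4"},
}
print(sorted(build_contigs(hits)))
```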

The patent publication WO95/35505 and U.S. Patents 5,445,943 and 5,410,270
describe scanning multiple alleles of a plurality of loci using hybridization to arrays of
oligonucleotides. These techniques are useful for each of the types of mapping discussed
above.

Following the procedures described above and using a plurality of the SDFs of
the present invention, any individual can be genotyped. These individual genotypes can be
used for the identification of particular cultivars, varieties, lines, ecotypes and genetically
modified plants or can serve as tools for subsequent genetic studies involving multiple
phenotypic traits.

B.3. Southern Blot Hybridization

The sequences from Table 1 can be used as probes for various hybridization 
techniques. These techniques are useful for detecting target polynucleotides in a sample or 
for determining whether transgenic plants, seeds or host cells harbor a gene or sequence of 
interest and thus might be expected to exhibit a particular trait or phenotype. 

In addition, the SDFs from the invention can be used to isolate additional members of
gene families from the same or different species and/or orthologous genes from the same or
different species. This is accomplished by hybridizing an SDF to, for example, a Southern
blot containing the appropriate genomic DNA or cDNA. Given the resulting hybridization
data, one of ordinary skill in the art could distinguish and isolate the correct DNA fragments
by size, restriction sites, sequence and stated hybridization conditions from a gel or from a
library.

Identification and isolation of orthologous genes from closely related species and
alleles within a species is particularly desirable because of their potential for crop
improvement. Many important crop traits, such as the solid content of tomatoes, result from
the combined interactions of the products of several genes residing at different loci in the
genome. Generally, alleles at each of these loci can make quantitative differences to the trait.
By identifying and isolating numerous alleles for each locus from within or different species,
transgenic plants with various combinations of alleles can be created and the effects of the
combinations measured. Once a more favorable allele combination has been identified, crop
improvement can be accomplished either through biotechnological means or by directed
conventional breeding programs (Tanksley et al. Science 277:1063 (1997)).

The results from hybridizations of the SDFs of the invention to, for example,
Southern blots containing DNA from another species can also be used to generate restriction
fragment maps for the corresponding genomic regions. These maps provide additional
information about the relative positions of restriction sites within fragments, further
distinguishing mapped DNA from the remainder of the genome.

Physical maps can be made by digesting genomic DNA with different combinations
of restriction enzymes.

Probes for Southern blotting to distinguish individual restriction fragments can range
in size from 15 to 20 nucleotides to several thousand nucleotides. More preferably, the probe
is 100 to 1,000 nucleotides long for identifying members of a gene family when it is found
that repetitive sequences would complicate the hybridization. For identifying an entire
corresponding gene in another species, the probe is more preferably the length of the gene,
typically 2,000 to 10,000 nucleotides, but probes 50-1,000 nucleotides long might be used.
Some genes, however, might require probes up to 1,500 nucleotides long or overlapping
probes constituting the full-length sequence to span their lengths.

Also, while it is preferred that the probe be homogeneous with respect to its sequence,
it is not necessary. For example, as described below, a probe representing members of a gene
family having diverse sequences can be generated using PCR to amplify genomic DNA or
RNA templates using primers derived from SDFs that include sequences that define the gene
family.

For identifying corresponding genes in another species, the next most preferable
probe is a cDNA spanning the entire coding sequence, which allows all of the mRNA-coding
fragment of the gene to be identified. Probes for Southern blotting can easily be generated
from SDFs by making primers having the sequence at the ends of the SDF and using corn or
Arabidopsis genomic DNA as a template. In instances where the SDF includes sequence
conserved among species, primers including the conserved sequence can be used for PCR
with genomic DNA from a species of interest to obtain a probe.

Similarly, if the SDF includes a domain of interest, that fragment of the SDF can be used to
make primers and, with appropriate template DNA, used to make a probe to identify genes
containing the domain. Alternatively, the PCR products can be resolved, for example by gel
electrophoresis, and cloned and/or sequenced. Using Southern hybridization, the variants of
the domain among members of a gene family, both within and across species, can be
examined.

B.4.1 Isolating DNA from Related Organisms 

The SDFs of the invention can be used to isolate the corresponding DNA from other
organisms. Either cDNA or genomic DNA can be isolated. For isolating genomic DNA, a
lambda, cosmid, BAC or YAC, or other large insert genomic library from the plant of interest
can be constructed using standard molecular biology techniques as described in detail by
Sambrook et al. 1989 (Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor
Laboratory Press, New York) and by Ausubel et al. 1992 (Current Protocols in Molecular
Biology, Greene Publishing, New York).

To screen a phage library, for example, recombinant lambda clones are plated out on
appropriate bacterial medium using an appropriate E. coli host strain. The resulting plaques
are lifted from the plates using nylon or nitrocellulose filters. The plaque lifts are processed
through denaturation, neutralization, and washing treatments following the standard protocols
outlined by Ausubel et al. (1992). The plaque lifts are hybridized to either radioactively
labeled or non-radioactively labeled SDF DNA at room temperature for about 16 hours,
usually in the presence of 50% formamide and 5X SSC (sodium chloride and sodium citrate)
buffer and blocking reagents. The plaque lifts are then washed at 42°C with 1% Sodium
Dodecyl Sulfate (SDS) and at a particular concentration of SSC. The SSC concentration used
is dependent upon the stringency at which hybridization occurred in the initial Southern blot
analysis performed. For example, if a fragment hybridized under medium stringency (e.g.,
Tm - 20°C), then this condition is maintained or preferably adjusted to a less stringent
condition (e.g., Tm - 30°C) to wash the plaque lifts. Positive clones show detectable
hybridization, e.g., by exposure to X-ray films or chromogen formation. The positive clones
are then subsequently isolated for purification using the same general protocol outlined
above. Once the clone is purified, restriction analysis can be conducted to narrow the region
corresponding to the gene of interest. The restriction analysis and succeeding subcloning
steps can be done using procedures described by, for example, Sambrook et al. (1989) cited
above.

The procedures outlined for the lambda library are essentially similar to those used for
YAC library screening, except that the YAC clones are harbored in bacterial colonies. The
YAC clones are plated out at reasonable density on nitrocellulose or nylon filters supported
by appropriate bacterial medium in petri plates. Following the growth of the bacterial clones,
the filters are processed through the denaturation, neutralization, and washing steps following
the procedures of Ausubel et al. 1992. The same hybridization procedures for lambda library
screening are followed.

To isolate cDNA, similar procedures using appropriately modified vectors are
employed. For instance, the library can be constructed in a lambda vector appropriate for
cloning cDNA such as λgt11. Alternatively, the cDNA library can be made in a plasmid
vector. cDNA for cloning can be prepared by any of the methods known in the art, but is
preferably prepared as described above. Preferably, a cDNA library will include a high
proportion of full-length clones.

B.5. Isolating and/or Identifying Orthologous Genes

Probes and primers of the invention can be used to identify and/or isolate
polynucleotides related to those in Table 1. Related polynucleotides are those that are native to
other plant organisms and exhibit either similar sequence or encode polypeptides with similar
biological activity. One specific example is an orthologous gene. Orthologous genes have the
same functional activity. As such, orthologous genes may be distinguished from homologous
genes. The percentage of identity is a function of evolutionary separation and, in closely related
species, the percentage of identity can be 98 to 100%. The amino acid sequence of a protein
encoded by an orthologous gene can be less than 75% identical, but tends to be at least 75% or at
least 80% identical, more preferably at least 90%, most preferably at least 95% identical to the
amino acid sequence of the reference protein.

To find orthologous genes, the probes are hybridized to nucleic acids from a species of interest
under low stringency conditions, preferably one where sequences containing as much as 40-45%
mismatches will be able to hybridize. This condition is established by Tm - 40°C to Tm - 48°C
(see below). Blots are then washed under conditions of increasing stringency. It is preferable
that the wash stringency be such that sequences that are 85 to 100% identical will hybridize.
More preferably, sequences 90 to 100% identical will hybridize and most preferably only
sequences greater than 95% identical will hybridize. One of ordinary skill in the art will
recognize that, due to degeneracy in the genetic code, amino acid sequences that are identical
can be encoded by DNA sequences as little as 67% identical or less. Thus, it is preferable, for
example, to make an overlapping series of shorter probes, on the order of 24 to 45 nucleotides,
and individually hybridize them to the same arrayed library to avoid the problem of degeneracy
introducing large numbers of mismatches.
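
An overlapping series of shorter probes of the kind described above can be generated as in the following sketch. The function name tile_probes and the default probe length and step are hypothetical choices made for this illustration; only the 24 to 45 nucleotide range comes from the text. If the sequence length does not fit the step exactly, the last few bases are not covered by this simple version.

```python
def tile_probes(sequence, probe_len=30, step=10):
    """Return overlapping probes of probe_len nt, each offset by step nt."""
    if not 24 <= probe_len <= 45:
        raise ValueError("probe_len should be on the order of 24 to 45 nt")
    if not 0 < step < probe_len:
        raise ValueError("step must be positive and smaller than probe_len")
    last_start = len(sequence) - probe_len
    return [sequence[i:i + probe_len] for i in range(0, last_start + 1, step)]

# Invented 60-nt sequence tiled with 30-nt probes overlapping by 20 nt.
seq = "ACGT" * 15
probes = tile_probes(seq)
print(len(probes), [len(p) for p in probes])
```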

As evolutionary divergence increases, genome sequences also tend to diverge. Thus, 
one of skill will recognize that searches for orthologous genes between more divergent 
species will require the use of lower stringency conditions compared to searches between 
closely related species. Also, degeneracy of the genetic code is more of a problem for 
searches in the genome of a species more distant evolutionarily from the species that is the 
source of the SDF probe sequences. 

Therefore the method described in Bouckaert et al., U.S. Ser. No. 60/121,700, Atty.
Dkt. No. 2750-117P, Client Dkt. No. 00010.001, filed February 25, 1999, hereby
incorporated in its entirety by reference, can be applied to the SDFs of the present invention
to isolate related genes from plant species which do not hybridize to the corn, Arabidopsis,
soybean, rice, wheat, and other plant sequences of Table 1.

Identification of the relationship of nucleotide or amino acid sequences among plant 
species can be done by comparing the nucleotide or amino acid sequences of SDFs of the 
present application with nucleotide or amino acid sequences of other SDFs such as those 
present in applications listed in the table below: 

Attorney Docket  Client Docket  Filing Date  Application No.
2750-0301P  80002.001  9/4/1998  60/099,672
2750-0300P  80001.001  9/4/1998  60/099,671
2750-0302P  80003.001  9/11/1998  60/099,933
2750-0304P  80004.001  9/17/1998  60/100,864
2750-0305P  80005.001  9/18/1998  60/101,042
2750-0306P  80006.001  9/21/1998  60/101,255
2750-0307P  80007.001  9/24/1998  60/101,682
2750-0308P  80008.001  9/30/1998  60/102,533
2750-0309P  80009.001  9/30/1998  60/102,460
2750-0310P  80010.001  10/5/1998  60/103,116
2750-0311P  80011.001  10/5/1998  60/103,141
2750-0312P  80012.001  10/6/1998  60/103,215
2750-0313P  80013.001  10/8/1998  60/103,554
2750-0314P  80014.001  10/9/1998  60/103,574
2750-0315P  80015.001  10/13/1998  60/103,907
2750-0316P  80016.001  10/14/1998  60/104,268
2750-0317P  80017.001  10/16/1998  60/104,680
2750-0318P  80018.001  10/19/1998  60/104,828
2750-0319P  80019.001  10/20/1998  60/105,008
2750-0320P  80020.001  10/21/1998  60/105,142
2750-0321P  80021.001  10/22/1998  60/105,533
2750-0322P  80022.001  10/26/1998  60/105,571
2750-0323P  80023.001  10/27/1998  60/105,815
2750-0324P  80024.001  10/29/1998  60/106,105
2750-0325P  80025.001  10/30/1998  60/106,218
2750-0326P  80026.001  11/2/1998  60/106,685
2750-0327P  80027.001  11/6/1998  60/107,282
2750-0329P  80029.001  11/9/1998  60/107,719
2750-0328P  80028.001  11/9/1998  60/107,720
2750-0330P  80030.001  11/10/1998  60/107,836
2750-0331P  80031.001  11/12/1998  60/108,190
2750-0332P  80032.001  11/16/1998  60/108,526
2750-0333P  80033.001  11/17/1998  60/108,901
2750-0335P  80035.001  11/19/1998  60/109,127
2750-0334P  80034.001  11/19/1998  60/109,124
2750-0336P  80036.001  11/20/1998  60/109,267
2750-0337P  80037.001  11/23/1998  60/109,594
2750-0338P  80038.001  11/25/1998  60/110,053
2750-0339P  80039.001  11/25/1998  60/110,050
2750-0340P  80040.001  11/27/1998  60/110,158
2750-0341P  80041.001  11/30/1998  60/110,263
2750-0342P  80042.001  12/1/1998  60/110,495
2750-0343P  80043.001  12/2/1998  60/110,626
2750-0344P  80044.001  12/3/1998  60/110,701
2750-0345P  80045.001  12/7/1998  60/111,339
2750-0346P  80046.001  12/9/1998  60/111,589
2750-0347P  80047.001  12/10/1998  60/111,782
2750-0348P  80048.001  12/11/1998  60/111,812
2750-0349P  80049.001  12/14/1998  60/112,096
2750-0350P  80050.001  12/15/1998  60/112,224
2750-0351P  80051.001  12/16/1998  60/112,624
2750-0352P  80052.001  12/17/1998  60/112,862
2750-0353P  80053.001  12/18/1998  60/112,912


2750-0354P  80054.001  12/21/1998  60/113,248
2750-0355P  80055.001  12/22/1998  60/113,522
2750-0356P  80056.001  12/23/1998  60/113,826
2750-0357P  80057.001  12/28/1998  60/113,998
2750-0358P  80058.001  12/29/1998  60/114,384
2750-0359P  80059.001  12/30/1998  60/114,455
2750-0360P  80060.001  1/4/1999  60/114,740
2750-0361P  80061.001  1/6/1999  60/114,866
2750-0362P  80062.001  1/7/1999  60/115,153
2750-0367P  80067.001  1/7/1999  60/115,154
2750-0366P  80066.001  1/7/1999  60/115,156
2750-0365P  80065.001  1/7/1999  60/115,155
2750-0363P  80063.001  1/7/1999  60/115,152
2750-0364P  80064.001  1/7/1999  60/115,151
2750-0370P  80070.001  1/8/1999  60/115,293
2750-0369P  80069.001  1/8/1999  60/115,365
2750-0368P  80068.001  1/8/1999  60/115,364
2750-0371P  80071.001  1/11/1999  60/115,339
2750-0372P  80072.001  1/12/1999  60/115,518
2750-0373P  80073.001  1/13/1999  60/115,847
2750-0374P  80074.001  1/14/1999  60/115,905
2750-0375P  80075.001  1/15/1999  60/116,383
2750-0376P  80076.001  1/15/1999  60/116,384
2750-0378P  80078.001  1/19/1999  60/116,340
2750-0377P  80077.001  1/19/1999  60/116,329
2750-0380P  80080.001  1/21/1999  60/116,672
2750-0379P  80079.001  1/21/1999  60/116,674
2750-0381P  80081.001  1/22/1999  60/116,960
2750-0382P  80082.001  1/22/1999  60/116,962
2750-0383P  80083.001  1/28/1999  60/117,756
2750-0384P  80084.001  2/3/1999  60/118,672
2750-0385P  80085.001  2/4/1999  60/118,808
2750-0386P  80086.001  2/5/1999  60/118,778
2750-0387P  80087.001  2/8/1999  60/119,029
2750-0388P  80088.001  2/9/1999  60/119,332
2750-0389P  80089.001  2/10/1999  60/119,462
2750-0391P  80091.001  2/12/1999  60/119,922
2750-0392P  80092.001  2/16/1999  60/120,196
2750-0393P  80093.001  2/16/1999  60/120,198
2750-0394P  80094.001  2/18/1999  60/120,583
2750-0395P  80095.001  2/22/1999  60/121,072
2750-0396P  80096.001  2/23/1999  60/121,334
2750-0397P  80097.001  2/24/1999  60/121,470
2750-0390P  80090.001  2/25/1999  60/121,825
2750-0398P  80098.001  2/25/1999  60/121,704
2750-0399P  80099.001  2/26/1999  60/122,107
2750-0400P  80100.001  3/1/1999  60/122,266
2750-0401P  80101.001  3/2/1999  60/122,568
2750-0402P  80102.001  3/3/1999  60/122,611


2750-0403P  80103.001  3/4/1999  60/121,775
2750-0405P  80105.001  3/5/1999  60/123,180
2750-0404P  80104.001  3/5/1999  60/123,534
2750-0406P  80106.001  3/9/1999  60/123,680
2750-0407P  80107.001  3/9/1999  60/123,548
2750-0408P  80108.001  3/10/1999  60/123,715
2750-0409P  80109.001  3/10/1999  60/123,726
2750-0410P  80110.001  3/11/1999  60/124,263
2750-0411P  80111.001  3/12/1999  60/123,941
2750-0412P  80112.001  3/23/1999  60/125,788
2750-0413P  80113.001  3/25/1999  60/126,264
2750-0414P  80114.001  3/29/1999  60/126,785
2750-0415P  80115.001  4/1/1999  60/127,462
2750-0416P  91000.001  4/6/1999  60/128,234
2750-0417P  91001.001  4/8/1999  60/128,714
2750-0418P  80118.001  4/16/1999  60/129,845
2750-0420P  80120.001  4/19/1999  60/130,077
2750-0421P  80121.001  4/21/1999  60/130,449
2750-0303P  80115.002  4/23/1999  60/130,510
2750-0422P  80122.001  4/23/1999  60/130,891
2750-0423P  80123.001  4/28/1999  60/131,449
2750-0424P  80124.001  4/30/1999  60/132,407
2750-0425P  80125.001  4/30/1999  60/132,048
2750-0426P  80126.001  5/4/1999  60/132,484
2750-0427P  80127.001  5/5/1999  60/132,485
2750-0428P  91002.001  5/6/1999  60/132,487
2750-0429P  80129.001  5/6/1999  60/132,486
2750-0430P  80130.001  5/7/1999  60/132,863
2750-0431P  80131.001  5/11/1999  60/134,256
2750-0433P  00025.001  5/14/1999  60/134,221
2750-0432P  91006.001  5/14/1999  60/134,370
2750-0434P  80116.001  5/14/1999  60/134,219
2750-0435P  80117.001  5/14/1999  60/134,218
2750-0436P  91007.001  5/18/1999  60/134,768
2750-0437P  91008.001  5/19/1999  60/134,941
2750-0438P  91009.001  5/20/1999  60/135,124
2750-0439P  91010.001  5/21/1999  60/135,353
2750-0440P  91011.001  5/24/1999  60/135,629
2750-0441P  91012.001  5/25/1999  60/136,021
2750-0442P  91013.001  5/27/1999  60/136,392
2750-0444P  91014.001  5/28/1999  60/136,782
2750-0445P  91015.001  6/1/1999  60/137,222
2750-0446P  91016.001  6/3/1999  60/137,528
2750-0447P  91017.001  6/4/1999  60/137,502
2750-0449P  91018.001  6/7/1999  60/137,724
2750-0450P  91019.001  6/8/1999  60/138,094
2750-0457P  00033.001  6/10/1999  60/138,540
2750-0458P  00033.002  6/10/1999  60/138,847
2750-0463P  00034.001  6/14/1999  60/139,119


2750-0461P  80132.011  6/16/1999  60/139,453
2750-0462P  80132.012  6/16/1999  60/139,452
2750-0464P  00037.001  6/17/1999  60/139,492
2750-0453P  80132.005  6/18/1999  60/139,462
2750-0466P  00039.001  6/18/1999  60/139,750
2750-0465P  00038.001  6/18/1999  60/139,763
2750-0460P  80132.010  6/18/1999  60/139,455
2750-0451P  80132.003  6/18/1999  60/139,459
2750-0454P  80132.006  6/18/1999  60/139,457
2750-0459P  80132.009  6/18/1999  60/139,463
2750-0448P  80132.002  6/18/1999  60/139,454
2750-0443P  80132.001  6/18/1999  60/139,458
2750-0456P  80132.008  6/18/1999  60/139,456
2750-0455P  80132.007  6/18/1999  60/139,460
2750-0452P  80132.004  6/18/1999  60/139,461
2750-0467P  00042.001  6/21/1999  60/139,817
2750-0468P  00043.001  6/22/1999  60/139,899
2750-0470P  00042.002  6/23/1999  60/140,353
2750-0469P  00044.001  6/23/1999  60/140,354
2750-0471P  00045.001  6/24/1999  60/140,695
2750-0472P  00046.001  6/28/1999  60/140,823
2750-0473P  00048.001  6/29/1999  60/140,991
2750-0474P  00049.001  6/30/1999  60/141,287
2750-0475P  00050.001  7/1/1999  60/141,842
2750-0476P  00051.001  7/1/1999  60/142,154
2750-0477P  00052.001  7/2/1999  60/142,055
2750-0478P  00053.001  7/6/1999  60/142,390
2750-0479P  00054.001  7/8/1999  60/142,803
2750-0480P  00058.001  7/9/1999  60/142,920
2750-0481P  00059.001  7/12/1999  60/142,977
2750-0482P  00060.001  7/13/1999  60/143,542
2750-0489P  00061.001  7/14/1999  60/143,624
2750-0490P  00062.001  7/15/1999  60/144,005
2750-0485P  80134.003  7/16/1999  60/144,086
2750-0486P  80134.004  7/16/1999  60/144,085
2750-0497P  00064.001  7/19/1999  60/144,325
2750-0496P  80134.014  7/19/1999  60/144,334
2750-0495P  80134.013  7/19/1999  60/144,335
2750-0494P  80134.010  7/19/1999  60/144,333
2750-0492P  80134.008  7/19/1999  60/144,331
2750-0488P  80134.006  7/19/1999  60/144,332
2750-0500P  00065.001  7/20/1999  60/144,632
2750-0502P  80135.002  7/20/1999  60/144,884
2750-0499P  80134.012  7/20/1999  60/144,352
2750-0503P  00066.001  7/21/1999  60/144,814
2750-0483P  80134.001  7/21/1999  60/145,088
2750-0484P  80134.002  7/21/1999  60/145,086
2750-0504P  00067.001  7/22/1999  60/145,192
2750-0491P  80134.007  7/22/1999  60/145,085


2750-0493P  80134.009  7/22/1999  60/145,087
2750-0487P  80134.005  7/22/1999  60/145,089
2750-0498P  80134.011  7/23/1999  60/145,145
2750-0501P  80135.001  7/23/1999  60/145,224
2750-0505P  00069.001  7/23/1999  60/145,218
2750-0506P  00070.001  7/26/1999  60/145,276
2750-0507P  80136.001  7/27/1999  60/145,918
2750-0508P  80136.002  7/27/1999  60/145,919
2750-0509P  00071.001  7/27/1999  60/145,913
2750-0510P  00072.001  7/28/1999  60/145,951
2750-0513P  00073.001  8/2/1999  60/146,386
2750-0512P  80137.002  8/2/1999  60/146,389
2750-0511P  80137.001  8/2/1999  60/146,388
2750-0514P  00074.001  8/3/1999  60/147,038
2750-0515P  00076.001  8/4/1999  60/147,204
2750-0517P  80138.002  8/4/1999  60/147,302
2750-0519P  80136.003  8/5/1999  60/147,192
2750-0518P  00077.001  8/5/1999  60/147,260
2750-0520P  00079.001  8/6/1999  60/147,416
2750-0516P  80138.001  8/6/1999  60/147,303
2750-0523P  80139.002  8/9/1999  60/147,935
2750-0521P  00080.001  8/9/1999  60/147,493
2750-0522P  80139.001  8/10/1999  60/148,171
2750-0524P  00081.001  8/11/1999  60/148,319
2750-0530P  00082.001  8/12/1999  60/148,341
2750-0525P  80141.001  8/12/1999  60/148,347
2750-0526P  80141.002  8/12/1999  60/148,342
2750-0527P  80141.003  8/12/1999  60/148,340
2750-0528P  80141.004  8/12/1999  60/148,337
2750-0532P  80142.002  8/13/1999  60/148,684
2750-0529P  00083.001  8/13/1999  60/148,565
2750-0531P  80142.001  8/16/1999  60/149,368
2750-0533P  80001.002  8/17/1999  60/149,927
2750-0534P  80001.003  8/17/1999  60/149,928
2750-0535P  80001.004  8/17/1999  60/149,926
2750-0536P  80001.005  8/17/1999  60/149,925
2750-0537P  00084.001  8/17/1999  60/149,175
2750-0538P  00085.001  8/18/1999  60/149,426
2750-0542P  00087.001  8/20/1999  60/149,723
2750-0541P  80143.002  8/20/1999  60/149,929
2750-0539P  00086.001  8/20/1999  60/149,722
2750-0543P  00088.001  8/23/1999  60/149,902
2750-0540P  80143.001  8/23/1999  60/149,930
2750-0544P  00089.001  8/25/1999  60/150,566
2750-0547P  00090.001  8/26/1999  60/150,884
2750-0546P  80144.002  8/27/1999  60/151,066
2750-0548P  00091.001  8/27/1999  60/151,080
2750-0545P  80144.001  8/27/1999  60/151,065
2750-0549P  00092.001  8/30/1999  60/151,303


2750-0552P  00093.001  8/31/1999  60/151,438
2750-0553P  00094.001  9/1/1999  60/151,930
2750-0550P  80001.006  9/3/1999  09/391,631
2750-0551F(PC)  80001.100  9/3/1999  99/204,38
2750-0554P  00095.001  9/7/1999  60/152,363
2750-0555P  00096.001  9/10/1999  60/153,070
2750-0556P  00098.001  9/13/1999  60/153,758
2750-0557P  00099.001  9/15/1999  60/154,018
2750-0558P  00101.001  9/16/1999  60/154,039
2750-0559P  00102.001  9/20/1999  60/154,779
2750-0560P  00103.001  9/22/1999  60/155,139
2750-0561P  00104.001  9/23/1999  60/155,486
2750-0562P  00105.001  9/24/1999  60/155,659
2750-0563P  00106.001  9/28/1999  60/156,458
2750-0564P  00107.001  9/29/1999  60/156,596
2750-0570P  00108.001  10/4/1999  60/157,117
2750-0571P  00109.001  10/5/1999  60/157,753
2750-0565P  80010.002  10/5/1999  09/413,198
2750-0566P  80010.003  10/5/1999  09/412,922
2750-0567F(PC)  80010.100  10/5/1999  99/228,55
2750-0568F(PC)  80010.101  10/5/1999  99/228,54
2750-0569F(PC)  80010.102  10/5/1999  99/228,53
2750-0572P  00110.001  10/6/1999  60/157,865
2750-0575P  00111.001  10/7/1999  60/158,029
2750-0576P  00112.001  10/8/1999  60/158,232
2750-0577P  00113.001  10/12/1999  60/158,369
2750-0574P  80145.002  10/13/1999  60/159,295
2750-0579P  80146.002  10/13/1999  60/159,293
2750-0583P  80148.002  10/13/1999  60/159,294
2750-0573P  80145.001  10/14/1999  60/159,330
2750-0580P  80147.001  10/14/1999  60/159,638
2750-0581P  80147.002  10/14/1999  60/159,637
2750-0582P  80148.001  10/14/1999  60/159,329
2750-0578P  80146.001  10/14/1999  60/159,331
2750-0584P  00116.001  10/18/1999  60/159,584
2750-0586P  80149.001  10/21/1999  60/160,814
2750-0587P  80149.002  10/21/1999  60/160,770
2750-0588P  00119.001  10/21/1999  60/160,741
2750-0589P  80150.001  10/21/1999  60/160,768
2750-0590P  80150.002  10/21/1999  60/160,767
2750-0585P  00118.001  10/21/1999  60/160,815
2750-0593P  80151.002  10/22/1999  60/160,981
2750-0591P  00120.001  10/22/1999  60/160,980
2750-0592P  80151.001  10/22/1999  60/160,989
2750-0596P  80152.002  10/25/1999  60/161,404
2750-0595P  80152.001  10/25/1999  60/161,406
2750-0594P  00121.001  10/25/1999  60/161,405
2750-0597P  00122.001  10/26/1999  60/161,361
2750-0598P  80153.001  10/26/1999  60/161,360


2750-0599P  80153.002  10/26/1999  60/161,359
2750-0600P  80026.002  10/28/1999  09/428,944
2750-0601P  00123.001  10/28/1999  60/161,920
2750-0602P  80154.001  10/28/1999  60/161,992
2750-0603P  80154.002  10/28/1999  60/161,993
2750-0604P  00124.001  10/29/1999  60/162,143
2750-0605P  80155.001  10/29/1999  60/162,142
2750-0606P  80155.002  10/29/1999  60/162,228
2750-0607P  00125.001  11/1/1999  60/162,894
2750-0608P  80156.001  11/1/1999  60/162,891
2750-0609P  80156.002  11/1/1999  60/162,895
2750-0610P  00126.001  11/2/1999  60/163,093
2750-0611P  80157.001  11/2/1999  60/163,092
2750-0612P  80157.002  11/2/1999  60/163,091
2750-0614P  80158.001  11/3/1999  60/163,248
2750-0615P  80158.002  11/3/1999  60/163,281
2750-0613P  00127.001  11/3/1999  60/163,249
2750-0618P  80159.002  11/4/1999  60/163,380
2750-0617P  80159.001  11/4/1999  60/163,381
2750-0616P  00128.001  11/4/1999  60/163,379
2750-0621P  80160.002  11/8/1999  60/164,150
2750-0620P  80160.001  11/8/1999  60/164,151
2750-0619P  00129.001  11/8/1999  60/164,146
2750-0623P  80161.002  11/9/1999  60/164,260
2750-0625P  80162.002  11/9/1999  60/164,259
2750-0626P  80163.001  11/10/1999  60/164,321
2750-0630P  80164.002  11/10/1999  60/164,548
2750-0629P  80164.001  11/10/1999  60/164,545
2750-0627P  80163.002  11/10/1999  60/164,318
2750-0624P  80162.001  11/10/1999  60/164,317
2750-0622P  80161.001  11/10/1999  60/164,319
2750-0628P  00131.001  11/10/1999  60/164,544
2750-0636P  80166.002  11/12/1999  60/164,962
2750-0633P  80165.002  11/12/1999  60/164,960
2750-0634P  00133.001  11/12/1999  60/164,870
2750-0632P  80165.001  11/12/1999  60/164,871
2750-0631P  00132.001  11/12/1999  60/164,961
2750-0635P  80166.001  11/12/1999  60/164,959
2750-0637P  00134.001  11/15/1999  60/164,927
2750-0638P  80167.001  11/15/1999  60/164,929
2750-0639P  80167.002  11/15/1999  60/164,926
2750-0640P  00135.001  11/16/1999  60/165,669
2750-0642P  80168.002  11/16/1999  60/165,661
2750-0641P  80168.001  11/16/1999  60/165,671
2750-0643P  00136.001  11/17/1999  60/165,919
2750-0644P  80169.001  11/17/1999  60/165,918
2750-0645P  80169.002  11/17/1999  60/165,911
2750-0646P  00137.001  11/18/1999  60/166,157
2750-0647P  80170.001  11/18/1999  60/166,173


2750-0648P  80170.002  11/18/1999  60/166,158
2750-0650P  80171.001  11/19/1999  60/166,411
2750-0649P  00139.001  11/19/1999  60/166,419
2750-0651P  80171.002  11/19/1999  60/166,412
2750-0653P  80172.001  11/22/1999  60/166,750
2750-0652P  00140.001  11/22/1999  60/166,733
2750-0655P  80173.002  11/23/1999  60/167,362
2750-0654P  80173.001  11/24/1999  60/167,382
2750-0657P  80174.001  11/24/1999  60/167,234
2750-0656P  00141.001  11/24/1999  60/167,233
2750-0658P  80174.002  11/24/1999  60/167,235
2750-0660P  80175.001  11/30/1999  60/167,908
2750-0659P  00142.001  11/30/1999  60/167,904
2750-0661P  80175.002  11/30/1999  60/167,902
2750-0664P  80176.001  12/1/1999  60/168,233
2750-0662P  80042.002  12/1/1999  09/451,320
2750-0665P  80176.002  12/1/1999  60/168,231
2750-0663P  00143.001  12/1/1999  60/168,232
2750-0668P  80177.002  12/2/1999  60/168,548
2750-0667P  80177.001  12/2/1999  60/168,549
2750-0666P  00144.001  12/2/1999  60/168,546
2750-0669P  00145.001  12/3/1999  60/168,675
2750-0670P  80178.001  12/3/1999  60/168,673
2750-0671P  80178.002  12/3/1999  60/168,674
2750-0673P  80179.001  12/7/1999  60/169,278
2750-0672P  00147.001  12/7/1999  60/169,298
2750-0674P  80179.002  12/7/1999  60/169,302
2750-0675P  80180.001  12/8/1999  60/169,692
2750-0676P  80180.002  12/8/1999  60/169,691
2750-0677P  00149.001  12/16/1999  60/171,107
2750-0678P  80181.001  12/16/1999  60/171,114
2750-0679P  80181.002  12/16/1999  60/171,098
2750-0683P  80060.002  1/4/2000  09/478,081
2750-0686F(PC)  80070.100  1/7/2000  00/004,66
2750-0684P  80070.002  1/7/2000  09/479,221
2750-0685P  80183.002  1/19/2000  60/176,867
2750-0688P  80184.002  1/19/2000  60/176,910
2750-0681P  80182.002  1/19/2000  60/176,866
2750-0689P  00152.001  1/26/2000  60/178,166
2750-0691P  80185.001  1/27/2000  60/177,666
2750-0687P  80184.001  1/27/2000  60/178,545
2750-0682P  80183.001  1/27/2000  60/178,546
2750-0680P  80182.001  1/27/2000  60/178,544
2750-0690P  00153.001  1/27/2000  60/178,547
2750-0692P  00155.001  1/28/2000  60/178,754
2750-0693P  80186.001  1/28/2000  60/178,755
2750-0695P  00157.001  2/1/2000  60/179,395
2750-0696P  80187.001  2/1/2000  60/179,388
2750-0694P  80084.002  2/3/2000  09/497,191


2750-0697P  00158.001  2/3/2000  60/180,039
2750-0698P  80188.001  2/3/2000  60/180,139
2750-0699P  00159.001  2/4/2000  60/180,206
2750-0700P  80189.001  2/4/2000  60/180,207
2750-0701P  00160.001  2/7/2000  60/180,695
2750-0702P  80190.001  2/7/2000  60/180,696
2750-0703P  00161.001  2/9/2000  60/181,228
2750-0704P  80191.001  2/9/2000  60/181,214
2750-0705P  00162.001  2/10/2000  60/181,476
2750-0706P  80192.001  2/10/2000  60/181,551
2750-0707P  00163.001  2/15/2000  60/182,477
2750-0708P  80193.001  2/15/2000  60/182,516
2750-0712P  00164.001  2/15/2000  60/182,512
2750-0713P  80194.001  2/15/2000  60/182,478
2750-0715P  80195.001  2/17/2000  60/183,165
2750-0714P  00165.001  2/17/2000  60/183,166
2750-0717P  80196.001  2/24/2000  60/184,658
2750-0716P  00167.001  2/24/2000  60/184,667
2750-0709F(CA)  80090.102  2/25/2000  23/006,92
2750-0719P  00168.001  2/25/2000  60/185,118
2750-0718P  91022.001  2/25/2000  60/185,140
2750-0720P  80197.001  2/25/2000  60/185,119
2750-0709F(MX)  80090.101  2/25/2000  00/001,973
2750-0709F(EP)  80090.103  2/25/2000  00/301,439
2750-0709P  80090.002  2/25/2000  09/513,996
2750-0721P  91023.001  2/28/2000  60/185,398
2750-0722P  00169.001  2/28/2000  60/185,396
2750-0723P  80198.001  2/28/2000  60/185,397
2750-0724P  91024.001  2/29/2000  60/185,750
2750-0727P  91025.001  3/1/2000  60/186,277
2750-0725P  00170.001  3/1/2000
2750-0726P  80199.001  3/1/2000  60/186,296
2750-0710P  80100.002  3/1/2000  09/517,537
2750-0728P  80200.001  3/2/2000  60/187,178
2750-0729P  00172.001  3/2/2000  60/186,386
2750-0730P  80201.001  3/2/2000  60/186,387
2750-0711P  00171.001  3/2/2000  60/186,390
2750-0733P  80202.001  3/3/2000  60/186,669
2750-0731P  91026.001  3/3/2000  60/186,670
2750-0732P  00173.001  3/3/2000  60/186,748
2750-0734P  00174.001  3/7/2000  60/187,378
2750-0735P  91027.001  3/7/2000  60/187,379
2750-0736P  00175.001  3/8/2000  60/187,896
2750-0737P  80203.001  3/8/2000  60/187,888
2750-0738P  91028.001  3/9/2000  60/187,985
2750-0739P  00177.001  3/10/2000  60/188,187
2750-0741P  91030.001  3/10/2000
2750-0740P  80204.001  3/10/2000  60/188,186
2750-0742P  00178.001  3/10/2000  60/188,185


2750-0743P  80205.001  3/10/2000  60/188,175
2750-0744P  91031.001  3/13/2000  60/188,687
2750-0745P  00179.001  3/14/2000  60/189,080
2750-0746P  80206.001  3/14/2000  60/189,052
2750-0749P  80207.001  3/15/2000  60/189,462
2750-0748P  00180.001  3/15/2000  60/189,461
2750-0747P  91032.001  3/15/2000  60/189,460
2750-0753P  80211.001  3/16/2000  60/190,121
2750-0751P  80209.001  3/16/2000  60/189,947
2750-0750P  80208.001  3/16/2000  60/190,120
2750-0756P  80212.001  3/16/2000  60/189,959
2750-0752P  80210.001  3/16/2000  60/189,948
2750-0757P  91034.001  3/16/2000  60/189,965
2750-0754P  91033.001  3/16/2000  60/189,958
2750-0755P  00181.001  3/16/2000  60/189,953
2750-0762P  80214.001  3/20/2000  60/190,089
2750-0761P  00183.001  3/20/2000  60/190,545
2750-0760P  91035.001  3/20/2000  60/190,060
2750-0759P  80213.001  3/20/2000  60/190,070
2750-0758P  00182.001  3/20/2000  60/190,069
2750-0764P  80215.001  3/22/2000  60/191,097
2750-0763P  00184.001  3/22/2000  60/191,084
2750-0766P  00185.001  3/23/2000  60/191,543
2750-0765P  91036.001  3/23/2000  60/191,549
2750-0767P  80216.001  3/23/2000  60/191,545
2750-0770P  80217.001  3/24/2000  60/191,825
2750-0768P  91037.001  3/24/2000  60/191,826
2750-0769P  00186.001  3/24/2000  60/191,823
2750-0772P  00187.001  3/27/2000  60/192,421
2750-0773P  80218.001  3/27/2000  60/192,308
2750-0771P  91038.001  3/27/2000  60/192,420
2750-0774P  91039.001  3/29/2000  60/192,855
2750-0775P  00188.001  3/29/2000  60/192,940
2750-0776P  80219.001  3/29/2000  60/192,941
2750-0778P  00189.001  3/30/2000  60/193,244
2750-0777P  91040.001  3/30/2000  60/193,243
2750-0779P  80220.001  3/30/2000
2750-0781P  00190.001  3/31/2000  60/193,453
2750-0780P  91041.001  3/31/2000  60/193,469
2750-0782P  80221.001  3/31/2000  60/193,455
2750-0786P  00191.001  4/4/2000
2750-0787P  80222.001  4/4/2000
2750-0785P  91042.001  4/4/2000
2750-0789P
2750-0928P  00033.003  6/9/2000
2750-0929P  91071.001  6/8/2000
2750-0930P  00234.001  6/8/2000
2750-0931P  80271.001  6/8/2000
2750-0932P  00235.001  6/9/2000
2750-0933P  80272.001  6/9/2000

All applications listed in the table above are expressly incorporated herein by 
reference in their entirety and for all purposes. 

The SDFs of the invention can also be used as probes to search for genes that are 
related to the SDF within a species. Such related genes are typically considered to be 
members of a gene family. In such a case, the sequence similarity will often be concentrated 
into one or a few fragments of the sequence. The fragments of similar sequence that define 
the gene family typically encode a fragment of a protein or RNA that has an enzymatic or 
structural function. The percentage of identity in the amino acid sequence of the domain that 
defines the gene family is preferably at least 70%, more preferably 80 to 95%, most 
preferably 85 to 99%. To search for members of a gene family within a species, a low 
stringency hybridization is usually performed, but this will depend upon the size, distribution 
and degree of sequence divergence of domains that define the gene family. SDFs 
encompassing regulatory regions can be used to identify coordinately expressed genes by 
using the regulatory region sequence of the SDF as a probe. 

Where the SDFs are identified as being expressed from genes that confer a particular 
phenotype, the SDFs can also be used as probes to assay plants of different species for 
those phenotypes. 



I.C. Methods to Inhibit Gene Expression 
The nucleic acid molecules of the present invention can be used to inhibit gene 
transcription and/or translation. Examples of such methods include, without limitation: 

Antisense Constructs; 

Ribozyme Constructs; 

Chimeraplast Constructs; 
Co-Suppression; 

Transcriptional Silencing; and 

Other Methods of Gene Expression. 



Reference No. 2750-942P 




C.1. Antisense 

In some instances it is desirable to suppress expression of an endogenous or 
exogenous gene. A well-known instance is the FLAVOR-SAVOR™ tomato, in which the 
gene encoding ACC synthase is inactivated by an antisense approach, thus delaying softening 
of the fruit after ripening. See, for example, U.S. Patent No. 5,859,330; U.S. Patent No. 
5,723,766; Oeller et al., Science, 254:437-439 (1991); and Hamilton et al., Nature, 
346:284-287 (1990). Also, timing of flowering can be controlled by suppression of FLOWERING 
LOCUS C (FLC); high levels of this transcript are associated with late flowering, while 
absence of FLC is associated with early flowering (S.D. Michaels et al., Plant Cell 11:949 
(1999)). Also, the transition of apical meristem from production of leaves with associated 
shoots to flowering is regulated by TERMINAL FLOWER1, APETALA1 and LEAFY. Thus, 
when it is desired to induce a transition from shoot production to flowering, it is desirable to 
suppress TFL1 expression (S.J. Liljegren, Plant Cell 11:1007 (1999)). As another instance, 
arrested ovule development and female sterility result from suppression of the ethylene 
forming enzyme but can be reversed by application of ethylene (D. De Martinis et al., Plant 
Cell 11:1061 (1999)). The ability to manipulate female fertility of plants is useful in 
increasing fruit production and creating hybrids. 

In the case of polynucleotides used to inhibit expression of an endogenous gene, the 
introduced sequence need not be perfectly identical to a sequence of the target endogenous gene. 
The introduced polynucleotide sequence will typically be at least substantially identical to the 
target endogenous sequence. 

Some polynucleotide SDFs in Table 1 represent sequences that are expressed in 
corn, wheat, rice, soybean, Arabidopsis and/or other plants. Thus the invention includes using 
these sequences to generate antisense constructs to inhibit translation and/or promote 
degradation of transcripts of said SDFs, typically in a plant cell. 

To accomplish this, a polynucleotide segment from the desired gene that can hybridize to 
the mRNA expressed from the desired gene (the "antisense segment") is operably linked to a 
promoter such that the antisense strand of RNA will be transcribed when the construct is present 
in a host cell. A regulated promoter can be used in the construct to control transcription of the 
antisense segment so that transcription occurs only under desired circumstances. 

The antisense segment to be introduced generally will be substantially identical to at 
least a fragment of the endogenous gene or genes to be repressed. The sequence, however, need 
not be perfectly identical to inhibit expression. Further, the antisense product may hybridize to 
the untranslated region instead of or in addition to the coding sequence of the gene. The vectors 
of the present invention can be designed such that the inhibitory effect applies to other proteins 
within a family of genes exhibiting homology or substantial homology to the target gene. 

For antisense suppression, the introduced antisense segment sequence also need not 
be full length relative to either the primary transcription product or the fully processed 
mRNA. Generally, a higher percentage of sequence identity can be used to compensate for 
the use of a shorter sequence. Furthermore, the introduced sequence need not have the same 
intron or exon pattern, and homology of non-coding segments may be equally effective. 
Normally, a sequence of between about 30 or 40 nucleotides and the full length of the 
transcript can be used, though a sequence of at least about 100 nucleotides is preferred, a 
sequence of at least about 200 nucleotides is more preferred, and a sequence of at least about 
500 nucleotides is especially preferred. 
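
The length preferences above can be illustrated with a minimal sketch. It assumes the antisense segment is derived as the reverse complement of a sense-strand DNA fragment. The function names and tier labels are hypothetical, chosen only for illustration, and the "about" in each stated threshold is treated as an exact cutoff for simplicity.

```python
# Illustrative sketch only: names and tier labels are assumptions,
# not part of the specification.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def length_tier(segment: str) -> str:
    """Classify an antisense segment by the length preferences in the text."""
    n = len(segment)
    if n >= 500:
        return "especially preferred"
    if n >= 200:
        return "more preferred"
    if n >= 100:
        return "preferred"
    if n >= 30:
        return "usable"
    return "below the stated range"


if __name__ == "__main__":
    sense_fragment = "ATGGCCTAA" * 15  # 135 nt placeholder, not a real gene
    antisense = reverse_complement(sense_fragment)
    print(len(antisense), length_tier(antisense))
```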

C.2. Ribozymes 

It is also contemplated that gene constructs representing ribozymes and based on the 
SDFs in TABLE 1 are an object of the invention. Ribozymes can also be used to inhibit 
expression of genes by suppressing the translation of the mRNA into a polypeptide. It is 
possible to design ribozymes that specifically pair with virtually any target RNA and cleave the 
phosphodiester backbone at a specific location, thereby functionally inactivating the target RNA. 
In carrying out this cleavage, the ribozyme is not itself altered, and is thus capable of recycling 
and cleaving other molecules, making it a true enzyme. The inclusion of ribozyme sequences 
within antisense RNAs confers RNA-cleaving activity upon them, thereby increasing the 
activity of the constructs. 

A number of classes of ribozymes have been identified. One class of ribozymes is 
derived from a number of small circular RNAs, which are capable of self-cleavage and 
replication in plants. The RNAs replicate either alone (viroid RNAs) or with a helper virus 
(satellite RNAs). Examples include RNAs from avocado sunblotch viroid and the satellite 
RNAs from tobacco ringspot virus, lucerne transient streak virus, velvet tobacco mottle virus, 
Solanum nodiflorum mottle virus and subterranean clover mottle virus. The design and use of 
target RNA-specific ribozymes is described in Haseloff et al. Nature, 334 :585 (1988). 

Like the antisense constructs above, the ribozyme sequence fragment necessary for 
pairing need not be identical to the target nucleotides to be cleaved, nor identical to the 
sequences in TABLE 1. Ribozymes may be constructed by combining the ribozyme 
sequence and some fragment of the target gene which would allow recognition of the target 
gene mRNA by the resulting ribozyme molecule. Generally, the sequence in the ribozyme 
capable of binding to the target sequence exhibits at least 80%, preferably at least 85%, more 
preferably at least 90%, even more preferably at least 95%, and most preferably at least 96%, 
97%, 98% or 99% sequence identity to some fragment of a sequence in TABLE 1 or the 
complement thereof. The ribozyme can 
be equally effective in inhibiting mRNA translation by cleaving either in the untranslated or 
coding regions. Generally, a higher percentage of sequence identity can be used to 
compensate for the use of a shorter sequence. Furthermore, the introduced sequence need not 
have the same intron or exon pattern, and homology of non-coding segments may be equally 
effective. 

C.3. Chimeraplasts 

The SDFs of the invention, such as those described by Table 1, can also be used to 
construct chimeraplasts that can be introduced into a cell to produce at least one specific 
nucleotide change in a sequence corresponding to the SDF of the invention. A chimeraplast 
is an oligonucleotide comprising DNA and/or RNA that specifically hybridizes to a target 
region in a manner which creates a mismatched base-pair. This mismatched base-pair signals 
the cell's repair enzyme machinery which acts on the mismatched region resulting in the 
replacement, insertion or deletion of designated nucleotide(s). The altered sequence is then 
expressed by the cell's normal cellular mechanisms. Chimeraplasts can be designed to repair 
mutant genes, modify genes, introduce site-specific mutations, and/or act to interrupt or alter 
normal gene function (US Pat. Nos. 6,010,907 and 6,004,804; and PCT Pub. Nos. 
WO99/58723 and WO99/07865). 

C.4. Sense Suppression 

The SDFs of Table 1 of the present invention are also useful to modulate gene 
expression by sense suppression. Sense suppression represents another method of gene 
suppression by introducing at least one exogenous copy or fragment of the endogenous 
sequence to be suppressed. 

Introduction of expression cassettes in which a nucleic acid is configured in the sense 
orientation with respect to the promoter into the chromosome of a plant or by a self-replicating 
virus has been shown to be an effective means by which to induce degradation of mRNAs of 
target genes. For an example of the use of this method to modulate expression of endogenous 
genes, see Napoli et al., The Plant Cell 2:279 (1990), and U.S. Patent Nos. 5,034,323, 
5,231,020, and 5,283,184. Inhibition of expression may require some transcription of the 
introduced sequence. 

For sense suppression, the introduced sequence generally will be substantially identical 
to the endogenous sequence intended to be inactivated. The minimal percentage of sequence 
identity will typically be greater than about 65%, but a higher percentage of sequence identity 
might exert a more effective reduction in the level of normal gene products. Sequence identity 
of more than about 80% is preferred, though about 95% to absolute identity would be most 
preferred. As with antisense regulation, the effect would likely apply to any other proteins 
within a similar family of genes exhibiting homology or substantial homology to the suppressing 
sequence. 
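
The identity thresholds above can be sketched as follows. This is a simplified illustration that assumes the two sequences are already aligned and of equal length; a real comparison would first require a sequence alignment step. The function names and tier labels are hypothetical.

```python
# Illustrative sketch only: assumes pre-aligned, equal-length sequences.
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two aligned sequences match."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def sense_suppression_tier(identity: float) -> str:
    """Map a percent identity onto the preferences stated in the text."""
    if identity >= 95.0:
        return "most preferred"
    if identity > 80.0:
        return "preferred"
    if identity > 65.0:
        return "typical minimum met"
    return "below typical minimum"


if __name__ == "__main__":
    endogenous = "ACGTACGTAC"  # placeholder sequences, not real genes
    introduced = "ACGTACGTAA"
    pid = percent_identity(endogenous, introduced)
    print(pid, sense_suppression_tier(pid))
```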

C.5. Transcriptional Silencing 

The nucleic acid sequences of the invention, including the SDFs of Table 1, and 
fragments thereof, contain sequences that can be inserted into the genome of an organism 
resulting in transcriptional silencing. Such regulatory sequences need not be operatively linked 
to coding sequences to modulate transcription of a gene. Specifically, a promoter sequence 
without any other element of a gene can be introduced into a genome to transcriptionally silence 
an endogenous gene (see, for example, Vaucheret, H et al. (1998) The Plant Journal 16: 651- 
659). As another example, triple helices can be formed using oligonucleotides based on 
sequences from TABLE 1, fragments thereof, and substantially similar sequence thereto. The 
oligonucleotide can be delivered to the host cell and can bind to the promoter in the genome to 
form a triple helix and prevent transcription. An oligonucleotide of interest is one that can bind 
to the promoter and block binding of a transcription factor to the promoter. In such a case, the 
oligonucleotide can be complementary to the sequences of the promoter that interact with 
transcription binding factors. 

C.6. Other Methods to Inhibit Gene Expression 

Yet another means of suppressing gene expression is to insert a polynucleotide into the 
gene of interest to disrupt transcription or translation of the gene. 
Low frequency homologous recombination can be used to target a polynucleotide insert 
to a gene by flanking the polynucleotide insert with sequences that are substantially similar to 
the gene to be disrupted. Sequences from TABLE 1, fragments thereof, and substantially similar 
sequence thereto can be used for homologous recombination. 

In addition, random insertion of polynucleotides into a host cell genome can also be used 
to disrupt the gene of interest. Azpiroz-Leehan et al., Trends in Genetics 13:152 (1997). In this 
method, screening for clones from a library containing random insertions is preferred to 
identify those that have polynucleotides inserted into the gene of interest. Such screening can 
be performed using probes and/or primers described above based on sequences from TABLE 1, 
fragments thereof, and substantially similar sequence thereto. The screening can also be 
performed by selecting clones or R1 plants having a desired phenotype. 

I.D. Methods of Functional Analysis 

The constructs described in the methods under I.C. above can be used to determine 
the function of the polypeptide encoded by the gene that is targeted by the constructs. 

Down-regulating the transcription and translation of the targeted gene in the host cell 
or organism, such as a plant, may produce phenotypic changes as compared to a wild-type 
cell or organism. In addition, in vitro assays can be used to determine if any biological 
activity, such as calcium flux, DNA transcription, nucleotide incorporation, etc., is being 
modulated by the down-regulation of the targeted gene. 

Coordinated regulation of sets of genes, e.g., those contributing to a desired polygenic 
trait, is sometimes necessary to obtain a desired phenotype. SDFs of the invention 
representing transcription activation and DNA binding domains can be assembled into hybrid 
transcriptional activators. These hybrid transcriptional activators can be used with their 
corresponding DNA elements (i.e., those bound by the DNA-binding SDFs) to effect 
coordinated expression of desired genes (J.J. Schwarz et al., Mol. Cell. Biol. 12:266 (1992), 
A. Martinez et al., Mol. Gen. Genet. 261:546 (1999)). 

The SDFs of the invention can also be used in the two-hybrid genetic systems to 
identify networks of protein-protein interactions (L. McAlister-Henn et al., Methods 19:330 
(1999), J.C. Hu et al., Methods 20:80 (2000), M. Golovkin et al., J. Biol. Chem. 274:36428 
(1999), K. Ichimura et al., Biochem. Biophys. Res. Comm. 253:532 (1998)). The SDFs of the 
invention can also be used in various expression display methods to identify important 
protein-DNA interactions (e.g. B. Luo et al., J. Mol. Biol. 266:479 (1997)). 

I.E. Promoters 

The SDFs of the invention are also useful as structural or regulatory sequences in a 
construct for modulating the expression of the corresponding gene in a plant or other organism, 
e.g. a symbiotic bacterium. For example, promoter sequences associated with SDFs of Table 1 of 
the present invention can be useful in directing expression of coding sequences either as 
constitutive promoters or to direct expression in particular cell types, tissues, or organs or in 
response to environmental stimuli. 

With respect to the SDFs of the present invention, a promoter is likely to be a relatively 
small portion of a genomic DNA (gDNA) sequence located in the first 2000 nucleotides 
upstream from an initial exon identified in a gDNA sequence or initial "ATG" or methionine 
codon or translational start site in a corresponding cDNA sequence. Such promoters are more 
likely to be found in the first 1000 nucleotides upstream of an initial ATG or methionine codon 
or translational start site of a cDNA sequence corresponding to a gDNA sequence. In particular, 
the promoter is usually located upstream of the transcription start site. The fragments of a 
particular gDNA sequence that function as elements of a promoter in a plant cell will preferably 
be found to hybridize to gDNA sequences presented and described in Table 1 at medium or high 
stringency, relevant to the length of the probe and its base composition. 
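
As a minimal sketch of the positional rule above, the following extracts the candidate promoter window immediately upstream of a known start position. The function name, the parameter names `gdna`, `start_index` and `window`, and the example sequence are all hypothetical; locating the true initial exon or start codon in a genomic sequence is a separate problem that is not addressed here.

```python
# Illustrative sketch only: the start position is supplied by the caller.
def upstream_region(gdna: str, start_index: int, window: int = 2000) -> str:
    """Return up to `window` nucleotides immediately 5' of `start_index`.

    `start_index` is the 0-based position of the initial exon or of the
    initial ATG. A window of 2000 reflects the broader range in the text;
    1000 reflects the range described as more likely.
    """
    if not 0 <= start_index <= len(gdna):
        raise ValueError("start_index lies outside the sequence")
    if window < 0:
        raise ValueError("window must be non-negative")
    begin = max(0, start_index - window)
    return gdna[begin:start_index]


if __name__ == "__main__":
    gdna = "G" * 50 + "ATG" + "C" * 10  # placeholder, not a real locus
    print(len(upstream_region(gdna, 50)))        # truncated at sequence start
    print(len(upstream_region(gdna, 50, 20)))    # limited by the window
```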

Promoters are generally modular in nature. Promoters can consist of a basal promoter 
that functions as a site for assembly of a transcription complex comprising an RNA polymerase, 
for example RNA polymerase II. A typical transcription complex will include additional factors 
such as TFIIB, TFIID, and TFIIE. Of these, TFIID appears to be the only one to bind DNA 
directly. The promoter might also contain one or more enhancers and/or suppressors that 
function as binding sites for additional transcription factors that have the function of modulating 
the level of transcription with respect to tissue specificity and of transcriptional responses to 
particular environmental or nutritional factors, and the like. 

Short DNA sequences representing binding sites for proteins can be separated from each 
other by intervening sequences of varying length. For example, within a particular functional 
module, protein binding sites may be constituted by regions of 5 to 60, preferably 10 to 30, more 
preferably 10 to 20 nucleotides. Within such binding sites, there are typically 2 to 6 nucleotides 
that specifically contact amino acids of the nucleic acid binding protein. The protein binding 
sites are usually separated from each other by 10 to several hundred nucleotides, typically by 15 
to 150 nucleotides, often by 20 to 50 nucleotides. DNA binding sites in promoter elements often 
display dyad symmetry in their sequence. Often elements binding several different proteins, 
and/or a plurality of sites that bind the same protein, will be combined in a region of 50 to 1,000 
basepairs. 
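
Dyad symmetry, as mentioned above, can be illustrated with a short check that a site reads the same on both strands, i.e. that it equals its own reverse complement. This sketch tests only for perfect symmetry, which is a simplification; actual binding sites are often imperfectly symmetric or contain a central spacer. The function names are hypothetical.

```python
# Illustrative sketch only: tests for perfect dyad symmetry.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_dyad_symmetry(site: str) -> bool:
    """True if the site is identical to its reverse complement."""
    site = site.upper()
    return bool(site) and site == reverse_complement(site)


if __name__ == "__main__":
    for candidate in ("GAATTC", "GAATTA"):
        print(candidate, has_dyad_symmetry(candidate))
```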

Elements that have transcription regulatory function can be isolated from their 
corresponding endogenous gene, or the desired sequence can be synthesized, and recombined in 
constructs to direct expression of a coding region of a gene in a desired tissue-specific, temporal- 
specific or other desired manner of inducibility or suppression. When hybridizations are 
performed to identify or isolate elements of a promoter by hybridization to the long sequences 
presented in TABLE 1, conditions are adjusted to account for the above-described nature of 
promoters. For example, short probes, constituting the element sought, are preferably used under 
low temperature and/or high salt conditions. When long probes, which might include several 
promoter elements, are used, low to medium stringency conditions are preferred when 
hybridizing to promoters across species. 

If a nucleotide sequence of an SDF, or part of the SDF, functions as a promoter or 
fragment of a promoter, then nucleotide substitutions, insertions or deletions that do not 
substantially affect the binding of relevant DNA binding proteins would be considered 
equivalent to the exemplified nucleotide sequence. It is envisioned that there are instances 
where it is desirable to decrease the binding of relevant DNA binding proteins to silence or 
down-regulate a promoter, or conversely to increase the binding of relevant DNA binding 
proteins to enhance or up-regulate a promoter and vice versa. In such instances, 
polynucleotides representing changes to the nucleotide sequence of the DNA-protein contact 
region by insertion of additional nucleotides, changes to identity of relevant nucleotides, 
including use of chemically-modified bases, or deletion of one or more nucleotides are 
considered encompassed by the present invention. In addition, fragments of the promoter 
sequences described by Table 1 and variants thereof can be fused with other promoters or 
fragments to facilitate transcription and/or transcription in specific types of cells or under 
specific conditions. 

Promoter function can be assayed by methods known in the art, preferably by 
measuring activity of a reporter gene operatively linked to the sequence being tested for 
promoter function. Examples of reporter genes include those encoding luciferase, green 
fluorescent protein, GUS, neo, cat and bar. 

I.F. UTRs and Junctions 

Polynucleotides comprising untranslated (UTR) sequences and intron/exon junctions are 
also within the scope of the invention. UTR sequences include introns and 5' or 3' untranslated 
regions (5' UTRs or 3' UTRs). Fragments of the sequences shown in TABLE 1 can comprise 
UTRs and intron/exon junctions. 

These fragments of SDFs, especially UTRs, can have regulatory functions related to, for 
example, translation rate and mRNA stability. Thus, these fragments of SDFs can be isolated 
for use as elements of gene constructs for regulated production of polynucleotides encoding 
desired polypeptides. 

Introns of genomic DNA segments might also have regulatory functions. Sometimes 
regulatory elements, especially transcription enhancer or suppressor elements, are found 
within introns. Also, elements related to stability of heteronuclear RNA and efficiency of 
splicing and of transport to the cytoplasm for translation can be found in intron elements. 
Thus, these segments can also find use as elements of expression vectors intended for use to 
transform plants. 

Just as with promoters, UTR sequences and intron/exon junctions can vary from those 
shown in TABLE 1. Such changes from those sequences preferably will not affect the 
regulatory activity of the UTRs or intron/exon junction sequences on expression, 
transcription, or translation unless selected to do so. However, in some instances, down- or 
up-regulation of such activity may be desired to modulate traits or phenotypic or in vitro 
activity. 



I.G. Coding Sequences 



Reference No. 2750-942P 



Isolated polynucleotides of the invention can include coding sequences that encode 
polypeptides comprising an amino acid sequence encoded by sequences in TABLE 1 or an 
amino acid sequence presented in TABLE 1. 

A nucleotide sequence encodes a polypeptide if a cell (or a cell free in vitro system) 
expressing that nucleotide sequence produces a polypeptide having the recited amino acid 
sequence when the nucleotide sequence is transcribed and the primary transcript is 
subsequently processed and translated by a host cell (or a cell free in vitro system) harboring 
the nucleic acid. Thus, an isolated nucleic acid that encodes a particular amino acid sequence 
can be a genomic sequence comprising exons and introns or a cDNA sequence that represents 
the product of splicing thereof. An isolated nucleic acid encoding an amino acid sequence 
also encompasses heteronuclear RNA, which contains sequences that are spliced out during 
expression, and mRNA, which lacks those sequences. 

Coding sequences can be constructed using chemical synthesis techniques or by 
isolating coding sequences or by modifying such synthesized or isolated coding sequences as 
described above. 

In addition to coding sequences encoding the polypeptide sequences of TABLE 1, 
which are native to corn, Arabidopsis, soybean, rice, wheat, and other plants, the isolated 
polynucleotides can be polynucleotides that encode variants, fragments, and fusions of those 
native proteins. Such polypeptides are described below in part II. 

In variant polynucleotides generally, the number of substitutions, deletions or insertions 
is preferably less than 20%, more preferably less than 15%; even more preferably less than 10%, 
5%, 3% or 1% of the number of nucleotides comprising a particularly exemplified sequence. It 
is generally expected that non-degenerate nucleotide sequence changes that result in 1 to 10, 
more preferably 1 to 5 and most preferably 1 to 3 amino acid insertions, deletions or 
substitutions will not greatly affect the function of an encoded polypeptide. The most preferred 
embodiments are those wherein 1 to 20, preferably 1 to 10, most preferably 1 to 5 nucleotides 
are added to, deleted from and/or substituted in the sequences specifically disclosed in TABLE 
1. 
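
The percentage limits above amount to simple arithmetic on the number of changed positions relative to the length of the exemplified sequence. A minimal sketch follows; the function names are hypothetical, and counting the changes (which requires an alignment) is assumed to have been done elsewhere.

```python
# Illustrative sketch only: the change count is supplied by the caller.
def variation_percent(n_changes: int, reference_length: int) -> float:
    """Changes (substitutions, deletions, insertions) as a percentage of length."""
    if reference_length <= 0 or n_changes < 0:
        raise ValueError("length must be positive and changes non-negative")
    return 100.0 * n_changes / reference_length


def within_limit(n_changes: int, reference_length: int,
                 limit_percent: float = 20.0) -> bool:
    """True if the variation is strictly below the given percentage limit."""
    return variation_percent(n_changes, reference_length) < limit_percent


if __name__ == "__main__":
    # 30 changes in a 200-nucleotide sequence is 15% variation.
    print(variation_percent(30, 200))
    print(within_limit(30, 200), within_limit(30, 200, 15.0))
```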

Insertions or deletions in polynucleotides intended to be used for encoding a polypeptide 
preferably preserve the reading frame. This consideration is not so important in instances when 
the polynucleotide is intended to be used as a hybridization probe. 
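
Whether a set of insertions and deletions preserves the reading frame downstream of the changes can be checked from the net change in length, which must be a multiple of three. The sketch below uses a hypothetical function name and considers only the net length change; it does not assess codons lying between separate insertions and deletions.

```python
# Illustrative sketch only: considers the net length change alone.
def preserves_reading_frame(inserted: int, deleted: int) -> bool:
    """True if the net number of nucleotides added is a multiple of three."""
    if inserted < 0 or deleted < 0:
        raise ValueError("counts must be non-negative")
    return (inserted - deleted) % 3 == 0


if __name__ == "__main__":
    print(preserves_reading_frame(3, 0))  # one whole codon inserted
    print(preserves_reading_frame(1, 0))  # single-base insertion shifts the frame
```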

II. Polypeptides and Proteins 

II.A. Native polypeptides and proteins 

Polypeptides within the scope of the invention include both native proteins as well as 
variants, fragments, and fusions thereof. Polypeptides of the invention are those encoded by 
any of the six reading frames of sequences shown in TABLE 1, preferably encoded by the 
three frames reading in the 5' to 3' direction of the sequences as shown. 
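
The six reading frames referred to above can be enumerated with a short sketch: three frames on the sequence as shown and three on its reverse complement. This assumes the standard genetic code and an unambiguous DNA alphabet; the function names and frame labels ("+1" to "-3") are hypothetical.

```python
# Illustrative sketch only: standard genetic code, unambiguous DNA input.
from itertools import product

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(_BASES, repeat=3), _AMINO)}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq: str) -> dict:
    """Return translations of the three forward and three reverse frames."""
    forward = seq.upper()
    reverse = forward.translate(_COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


if __name__ == "__main__":
    for label, peptide in six_frame_translations("ATGGCCTAA").items():
        print(label, peptide)
```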

Native polypeptides include the proteins encoded by the sequences shown in TABLE 
1. Such native polypeptides include those encoded by allelic variants. 

Polypeptide and protein variants will exhibit at least 75% sequence identity to those 
native polypeptides of TABLE 1. More preferably, the polypeptide variants will exhibit at least 
85% sequence identity; even more preferably, at least 90% sequence identity; most preferably at 
least 95%, 96%, 97%, 98%, or 99% sequence identity. Fragments of polypeptides 
will exhibit similar percentages of sequence identity to the relevant fragments 
of the native polypeptide. Fusions will exhibit a similar percentage of sequence identity in that 
fragment of the fusion represented by the variant of the native peptide. 

Furthermore, polypeptide variants will exhibit at least one of the functional properties of 
the native protein. Such properties include, without limitation, protein interaction, DNA 
interaction, biological activity, immunological activity, receptor binding, signal transduction, 
transcription activity, growth factor activity, secondary structure, three-dimensional structure, 
etc. As to properties related to in vitro or in vivo activities, the variants preferably exhibit at least 
60% of the activity of the native protein; more preferably at least 70%, even more preferably at 
least 80%, 85%, 90% or 95% of at least one activity of the native protein. 

One type of variant of native polypeptides comprises amino acid substitutions, deletions 
and/or insertions. Conservative substitutions are preferred to maintain the function or activity of 
the polypeptide. 

Within the scope of percentage of sequence identity described above, a polypeptide of 
the invention may have additional individual amino acids or amino acid sequences inserted into 
the polypeptide in the middle thereof and/or at the N-terminal and/or C-terminal ends thereof. 
Likewise, some of the amino acids or amino acid sequences may be deleted from the 
polypeptide. 

A.1 Antibodies 

Isolated polypeptides can be utilized to produce antibodies. Polypeptides of the invention 
can generally be used, for example, as antigens for raising antibodies by known techniques. The 
resulting antibodies are useful as reagents for determining the distribution of the antigen protein 
within the tissues of a plant or within a cell of a plant. The antibodies are also useful for 
examining the production level of proteins in various tissues, for example in a wild-type plant or 
following genetic manipulation of a plant, by methods such as Western blotting. 

Antibodies of the present invention, both polyclonal and monoclonal, may be prepared 
by conventional methods. In general, the polypeptides of the invention are first used to 
immunize a suitable animal, such as a mouse, rat, rabbit, or goat. Rabbits and goats are 
preferred for the preparation of polyclonal sera due to the volume of serum obtainable, and the 
availability of labeled anti-rabbit and anti-goat antibodies as detection reagents. Immunization is 
generally performed by mixing or emulsifying the protein in saline, preferably in an adjuvant 
such as Freund's complete adjuvant, and injecting the mixture or emulsion parenterally 
(generally subcutaneously or intramuscularly). A dose of 50-200 µg/injection is typically 
sufficient. Immunization is generally boosted 2-6 weeks later with one or more injections of the 
protein in saline, preferably using Freund's incomplete adjuvant. One may alternatively 
generate antibodies by in vitro immunization using methods known in the art, which for the 
purposes of this invention is considered equivalent to in vivo immunization. 

Polyclonal antiserum is obtained by bleeding the immunized animal into a glass or plastic 
container, incubating the blood at 25°C for one hour, followed by incubating the blood at 4°C 
for 2-18 hours. The serum is recovered by centrifugation (e.g., 1,000 x g for 10 minutes). About 
20-50 ml per bleed may be obtained from rabbits. 

Monoclonal antibodies are prepared using the method of Kohler and Milstein, Nature 
256: 495 (1975), or modification thereof. Typically, a mouse or rat is immunized as described 
above. However, rather than bleeding the animal to extract serum, the spleen (and optionally 
several large lymph nodes) is removed and dissociated into single cells. If desired, the spleen 
cells can be screened (after removal of nonspecifically adherent cells) by applying a cell 
suspension to a plate, or well, coated with the protein antigen. B-cells producing membrane- 
bound immunoglobulin specific for the antigen bind to the plate, and are not rinsed away with 
the rest of the suspension. Resulting B-cells, or all dissociated spleen cells, are then induced to 
fuse with myeloma cells to form hybridomas, and are cultured in a selective medium (e.g., 
hypoxanthine, aminopterin, thymidine medium, "HAT"). The resulting hybridomas are plated 
by limiting dilution, and are assayed for the production of antibodies which bind specifically to 
the immunizing antigen (and which do not bind to unrelated antigens). The selected 
MAb-secreting hybridomas are then cultured either in vitro (e.g., in tissue culture bottles or hollow 
fiber reactors), or in vivo (as ascites in mice). 

Other methods for sustaining antibody-producing B-cell clones, such as by EBV 
transformation, are known. 

If desired, the antibodies (whether polyclonal or monoclonal) may be labeled using 
conventional techniques. Suitable labels include fluorophores, chromophores, radioactive atoms 
(particularly 32P and 125I), electron-dense reagents, enzymes, and ligands having specific binding 
partners. Enzymes are typically detected by their activity. For example, horseradish peroxidase 
is usually detected by its ability to convert 3,3',5,5'-tetramethylbenzidine (TMB) to a blue 
pigment, quantifiable with a spectrophotometer. 

A.2 In Vitro Applications of Polypeptides 

Some polypeptides of the invention will have enzymatic activities that are useful in vitro. 
For example, the soybean trypsin inhibitor (Kunitz) family is one of the numerous families of 
proteinase inhibitors. It comprises plant proteins which have inhibitory activity against serine 
proteinases from the trypsin and subtilisin families, thiol proteinases and aspartic proteinases. 
Thus, these peptides find in vitro use in protein purification protocols and perhaps in 
therapeutic settings requiring topical application of protease inhibitors. 

Delta-aminolevulinic acid dehydratase (EC 4.2.1.24) (ALAD) catalyzes the second 
step in the biosynthesis of heme, the condensation of two molecules of 5-aminolevulinate to 
form porphobilinogen and is also involved in chlorophyll biosynthesis (Kaczor et al. (1994) 
Plant Physiol. 104: 1411-7; Smith (1988) Biochem. J. 249: 423-8; Schneider (1976) Z. 
Naturforsch. [C] 31: 55-63). Thus, ALAD proteins can be used as catalysts in synthesis of 
heme derivatives. Enzymes of biosynthetic pathways generally can be used as catalysts for in 
vitro synthesis of the compounds representing products of the pathway. 

Polypeptides encoded by SDFs of the invention can be engineered to provide 
purification reagents to identify and purify additional polypeptides that bind to them. This 
allows one to identify proteins that function as multimers or elucidate signal transduction or 
metabolic pathways. In the case of DNA binding proteins, the polypeptide can be used in a 
similar manner to identify the DNA determinants of specific binding (S. Pierrou et al., Anal. 
Biochem. 229:99 (1995), S. Chusacultanachai et al., J. Biol. Chem. 274:23591 (1999), Q. Lin 
et al., J. Biol. Chem. 272:27274 (1997)). 

II.B. POLYPEPTIDE VARIANTS, FRAGMENTS, AND FUSIONS 

Generally, variants, fragments, or fusions of the polypeptides encoded by the SDFs of 
the invention can exhibit at least one of the activities of the identified domains and/or related 
polypeptides described in Table 1 corresponding to the SDF of interest. 

II.B.(1) Variants 

A type of variant of the native polypeptides comprises amino acid substitutions. 
Conservative substitutions, described above (see II.), are preferred to maintain the function or 
activity of the polypeptide. Such substitutions include conservation of charge, polarity, 
hydrophobicity, size, etc. For example, one or more amino acid residues within the sequence 
can be substituted with another amino acid of similar polarity that acts as a functional 
equivalent, for example providing a hydrogen bond in an enzymatic catalysis. Substitutes for an 
amino acid within an exemplified sequence are preferably made among the members of the class 
to which the amino acid belongs. For example, the nonpolar (hydrophobic) amino acids include 
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan and methionine. The 
polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine. The positively charged (basic) amino acids include arginine, lysine and histidine. 
The negatively charged (acidic) amino acids include aspartic acid and glutamic acid. 
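
The four classes listed above can be encoded directly to test whether a substitution stays within a class. This is a minimal sketch using one-letter amino acid codes; the names `AMINO_ACID_CLASSES`, `residue_class` and `is_conservative` are hypothetical, and class membership follows the grouping given in the text rather than any substitution matrix.

```python
# Illustrative sketch only: classes follow the grouping stated in the text.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the class name for a one-letter amino acid code."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unrecognized amino acid code: {residue!r}")


def is_conservative(original: str, substitute: str) -> bool:
    """True if both residues belong to the same class."""
    return residue_class(original) == residue_class(substitute)


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # both nonpolar
    print(is_conservative("D", "K"))  # acidic replaced by basic
```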

Within the scope of percentage of sequence identity described above, a polypeptide of 
the invention may have additional individual amino acids or amino acid sequences inserted into 
the polypeptide in the middle thereof and/or at the N-terminal and/or C-terminal ends thereof. 
Likewise, some of the amino acids or amino acid sequences may be deleted from the 
polypeptide. Amino acid substitutions may also be made in the sequences; conservative 
substitutions being preferred. 

One preferred class of variants are those that comprise (1) the domain of an 
encoded polypeptide and/or (2) residues conserved between the encoded polypeptide and 
related polypeptides. For this class of variants, the encoded polypeptide sequence is changed 
by insertion, deletion, or substitution at positions flanking the domain and/or conserved 
residues. 

Another class of variants includes those that comprise an encoded polypeptide 
sequence that is changed in the domain or conserved residues by a conservative substitution. 

Yet another class of variants includes those that lack one of the in vitro 
activities, or structural features of the encoded polypeptides. One example is polypeptides or 
proteins produced from genes comprising dominant negative mutations. Such a variant may 
comprise an encoded polypeptide sequence with non-conservative changes in a particular 
domain or group of conserved residues. 

II. A. (2) FRAGMENTS 

Fragments of particular interest are those that comprise a domain identified for a 
polypeptide encoded by an SDF of the instant invention and variants thereof. Also, fragments 
that comprise at least one region of residues conserved between an SDF encoded polypeptide 
and its related polypeptides are of great interest. Fragments are sometimes useful in the same 
way that polypeptides corresponding to genes comprising dominant negative mutations are. 

II.A.(3) FUSIONS 

Of interest are chimeras comprising (1) a fragment of the SDF encoded 
polypeptide or variants thereof of interest and (2) a fragment of a polypeptide comprising the 
same domain. For example, an AP2 helix encoded by an SDF of the invention may be fused to a 
second AP2 helix from the ANT protein, which comprises two AP2 helices. The present invention also 
encompasses fusions of SDF encoded polypeptides, variants, or fragments thereof fused with 
related proteins or fragments thereof. 
DEFINITION OF DOMAINS 

The polypeptides of the invention may possess identifying domains. In addition, the 
domains within the SDF encoded polypeptide can be defined by the region that exhibits at 
least 70% sequence identity with the consensus sequences listed in the detailed description 
below of each of the domains. 
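
The 70% identity criterion can be sketched in code. The following is a minimal illustration, not the method of the invention: it assumes the candidate region and a consensus sequence have already been aligned to equal length with `-` marking gaps, and it counts identity only over columns where neither sequence has a gap. The names `percent_identity` and `is_domain_region` are hypothetical, and comparing against a plain consensus string is a simplification of the consensus patterns given below.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def is_domain_region(region: str, consensus: str, threshold: float = 70.0) -> bool:
    """Apply the 'at least 70% sequence identity' criterion from the text."""
    return percent_identity(region, consensus) >= threshold
```

Other conventions for the denominator (for example, the full alignment length including gaps) would give different percentages, so the convention used should be stated alongside any reported value.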

The majority of the protein domain descriptions given below are obtained from Prosite 
(http://www.expasy.ch/prosite/) and Pfam (http://pfam.wustl.edu/browse.shtml). 




1. (AAA) AAA-protein family signature 

A large family of ATPases has been described [1 to 5] whose key feature is 
that they share a conserved region of about 220 amino acids that contains an ATP-binding 
site. This family is now called AAA, for 'ATPases Associated with diverse cellular 
Activities'. The proteins that belong to this family contain either one or two AAA domains. 

Proteins containing two AAA domains: 

- Mammalian and Drosophila NSF (N-ethylmaleimide-sensitive fusion protein) and the 
  fungal homolog, SEC18. These proteins are involved in intracellular transport between 
  the endoplasmic reticulum and Golgi, as well as between different Golgi cisternae. 
- Mammalian transitional endoplasmic reticulum ATPase (previously known as p97 or 
  VCP), which is involved in the transfer of membranes from the endoplasmic reticulum to 
  the Golgi apparatus. This protein forms a ring-shaped homooligomer composed of six 
  subunits. The yeast homolog is CDC48 and it may play a role in spindle pole 
  proliferation. 
- Yeast protein PAS1, essential for peroxisome assembly, and the related protein PAS1 
  from Pichia pastoris. 
- Yeast protein AFG2. 
- Sulfolobus acidocaldarius protein SAV and Halobacterium salinarium cdcH, which may 
  be part of a transduction pathway connecting light to cell division. 

Proteins containing a single AAA domain: 

- Escherichia coli and other bacteria ftsH (or hflB) protein. FtsH is an ATP-dependent zinc 
  metallopeptidase that seems to degrade the heat-shock sigma-32 factor. It is an integral 
  membrane protein with a large cytoplasmic C-terminal domain that contains both the AAA 
  and the protease domains. 
- Yeast protein YME1, a protein important for maintaining the integrity of the 
  mitochondrial compartment. YME1 is also a zinc-dependent protease. 
- Yeast protein AFG3 (or YTA10). This protein also seems to contain an AAA domain 
  followed by a zinc-dependent protease domain. 
- Subunits from the regulatory complex of the 26S proteasome [6], which is involved in 
  the ATP-dependent degradation of ubiquitinated proteins: 
  a) Mammalian subunit 4 and homologs in other higher eukaryotes, in yeast (gene 
     YTA5) and fission yeast (gene mts2). 
  b) Mammalian subunit 6 (TBP7) and homologs in other higher eukaryotes and in 
     yeast (gene YTA2). 
  c) Mammalian subunit 7 (MSS1) and homologs in other higher eukaryotes and in 
     yeast (gene CIM5 or YTA3). 
  d) Mammalian subunit 8 (P45) and homologs in other higher eukaryotes and in 
     yeast (SUG1 or CIM3 or TBY1) and fission yeast (gene let1). 
  Other probable subunits include human TBP1, which seems to influence HIV gene 
  expression by interacting with the virus tat transactivator protein, and yeast YTA1 and 
  YTA6. 
- Yeast protein BCS1, a mitochondrial protein essential for the expression of the 
  Rieske iron-sulfur protein. 
- Yeast protein MSP1, a protein involved in intramitochondrial sorting of proteins. 
- Yeast protein PAS8, and the corresponding proteins PAS5 from Pichia pastoris 
  and PAY4 from Yarrowia lipolytica. 
- Mouse protein SKD1 and its fission yeast homolog (SpAC2G11.06). 
- Caenorhabditis elegans meiotic spindle formation protein mei-1. 
- Yeast protein SAP1. 
- Yeast protein YTA7. 
- Mycobacterium leprae hypothetical protein A2126A. 

It is proposed that, in general, the AAA domains in these proteins act as ATP-dependent 
protein clamps [5]. In addition to the ATP-binding 'A' and 'B' motifs, which are located in 
the N-terminal half of this domain, there is a highly conserved region located in the central 
part of the domain which was used to develop a signature pattern. 

Consensus pattern: [LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-[NS]-x(4)-[LIVM]-D-x-A-[LIFA]-x-R 
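
Consensus patterns written in this notation can be scanned against a sequence by translating them into regular expressions. The sketch below is illustrative only and is not part of the Prosite tooling: the function name `prosite_to_regex` is hypothetical, and it handles only the elements that appear in the patterns of this section, namely single residues, `[...]` sets of allowed residues, `{...}` sets of excluded residues, `x` for any residue, and repeat counts such as `x(4)` or `x(0,1)`.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a pattern such as 'D-x-A-[LIFA]-x-R' into a regular expression."""
    parts = []
    # Elements are separated by '-'; stray spaces and a trailing '-' are tolerated.
    for element in pattern.replace(" ", "").strip("-").split("-"):
        # Separate the element from an optional repeat count, e.g. x(4) or x(0,1).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # any residue except those listed
        # '[...]' sets and single letters are already valid regex syntax.
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


AAA_PATTERN = ("[LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-[NS]-x(4)-"
               "[LIVM]-D-x-A-[LIFA]-x-R")
AAA_REGEX = re.compile(prosite_to_regex(AAA_PATTERN))
```

Calling `AAA_REGEX.search(sequence)` on an upper-case one-letter protein sequence returns a match object whose `start()` gives the 0-based offset of the first residue of the signature, or `None` when the signature is absent.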

[1] Froehlich K.-U., Fries H.W., Ruediger M., Erdmann R., Botstein D., Mecke D. J. Cell 
Biol. 114:443-453(1991). 

[2] Erdmann R., Wiebel F.F., Flessau A., Rytka J., Beyer A., Froehlich K.-U., Kunau W.-H. 
Cell 64:499-510(1991). 

[3] Peters J.-M., Walsh M.J., Franke W.W. EMBO J. 9:1757-1767(1990). 

[4] Kunau W.-H., Beyer A., Goette K., Marzioch M., Saidowsky J., Skaletz-Rorowski A., 
Wiebel F.F. Biochimie 75:209-224(1993). 

[5] Confalonieri F., Duguet M. BioEssays 17:639-650(1995). 

[6] Hilt W., Wolf D.H. Trends Biochem. Sci. 21:96-102(1996). 

2. ABC Membrane (ABC transporter transmembrane region). This family represents a unit of 
six transmembrane helices. Many members of the ABC transporter family (ABC_tran) have 
two such regions. See also descriptions of ABC Tran, below, and ABC2 membrane, above. 

3. (ABC Tran) ABC transporters family signature. On the basis of sequence similarities a 
family of related ATP-binding proteins has been characterized [1 to 5]. These proteins are 
associated with a variety of distinct biological processes in both prokaryotes and eukaryotes, 
but a majority of them are involved in active transport of small hydrophilic molecules across 
the cytoplasmic membrane. All these proteins share a conserved domain of some two 
hundred amino acid residues, which includes an ATP-binding site. These proteins are 
collectively known as ABC transporters. Proteins known to belong to this family are listed 
below (references are only provided for recently determined sequences). 

In prokaryotes: 

- Active transport systems components: alkylphosphonate uptake (phnC/phnK/phnL); 
  arabinose (araG); arginine (artP); dipeptide (dciAD; dppD/dppF); ferric enterobactin (fepC); 
  ferrichrome (fhuC); galactoside (mglA); glutamine (glnQ); glycerol-3-phosphate (ugpC); 
  glycine betaine/L-proline (proV); glutamate/aspartate (gltL); histidine (hisP); iron(III) (sfuC); 
  iron(III) dicitrate (fecE); lactose (lacK); leucine/isoleucine/valine (braF/braG; livF/livG); 
  maltose (malK); molybdenum (modC); nickel (nikD/nikE); oligopeptide 
  (amiE/amiF; oppD/oppF); peptide (sapD/sapF); phosphate (pstB); putrescine (potG); ribose 
  (rbsA); spermidine/putrescine (potA); sulfate (cysA); vitamin B12 (btuD). 
- Hemolysin/leukotoxin export proteins hlyB, cyaB and lktB. 
- Colicin V export protein cvaB. 
- Lactococcin export protein lcnC [6]. 
- Lantibiotic transport proteins nisT (nisin) and spaT (subtilin). 
- Extracellular proteases B and C export protein prtD. 
- Alkaline protease secretion protein aprD. 
- Beta-(1,2)-glucan export proteins chvA and ndvA. 
- Haemophilus influenzae capsule-polysaccharide export protein bexA. 
- Cytochrome c biogenesis proteins ccmA (also known as cycV and helA). 
- Polysialic acid transport protein kpsT. 
- Cell division associated ftsE protein (function unknown). 
- Copper processing protein nosF from Pseudomonas stutzeri. 
- Nodulation protein nodI from Rhizobium (function unknown). 
- Escherichia coli proteins cydC and cydD. 
- Subunit A of the ABC excision nuclease (gene uvrA). 
- Erythromycin resistance protein from Staphylococcus epidermidis (gene msrA). 
- Tylosin resistance protein from Streptomyces fradiae (gene tlrC) [7]. 
- Heterocyst differentiation protein (gene hetA) from Anabaena PCC 7120. 
- Protein P29 from Mycoplasma hyorhinis, a probable component of a high affinity 
  transport system. 
- yhbG, a putative protein whose gene is linked with ntrA in many bacteria such as 
  Escherichia coli, Klebsiella pneumoniae, Pseudomonas putida, Rhizobium meliloti and 
  Thiobacillus ferrooxidans. 
- Escherichia coli and related bacteria hypothetical proteins yabJ, yadG, yagC, ybbA, ycjW, 
  yddA, yehX, yejF, yheS, yhiG, yhiH, yjcW, yjjK, yojI, yrbF and ytfR. 

In eukaryotes: 

- The multidrug transporters (Mdr) (P-glycoprotein), a family of closely related proteins 
  which extrude a wide variety of drugs out of the cell (for a review see [8]). 
- Cystic fibrosis transmembrane conductance regulator (CFTR), which is most probably 
  involved in the transport of chloride ions. 
- Antigen peptide transporters 1 (TAP1, PSF1, RING4, HAM-1, mtp1) and 2 (TAP2, 
  PSF2, RING11, HAM-2, mtp2), which are involved in the transport of antigens from the 
  cytoplasm to a membrane-bound compartment for association with MHC class I molecules. 
- 70 Kd peroxisomal membrane protein (PMP70). 
- ALDP, a peroxisomal protein involved in X-linked adrenoleukodystrophy [9]. 
- Sulfonylurea receptor [10], a putative subunit of the B-cell ATP-sensitive potassium 
  channel. 
- Drosophila proteins white (w) and brown (bw), which are involved in the import of 
  ommatidium screening pigments. 
- Fungal elongation factor 3 (EF-3). 
- Yeast STE6, which is responsible for the export of the a-factor pheromone. 
- Yeast mitochondrial transporter ATM1. 
- Yeast MDL1 and MDL2. 
- Yeast SNQ2. 
- Yeast sporidesmin resistance protein (gene PDR5 or STS1 or YDR1). 
- Fission yeast heavy metal tolerance protein hmt1. This protein is probably involved in 
  the transport of metal-bound phytochelatins. 
- Fission yeast brefeldin A resistance protein (gene bfr1 or hba2). 
- Fission yeast leptomycin B resistance protein (gene pmd1). 
- mbpX, a hypothetical chloroplast protein from Liverwort. 
- Prestalk-specific protein tagB from slime mold. This protein consists of two domains: 
  an N-terminal subtilase catalytic domain and a C-terminal ABC transporter domain. 

As a signature pattern for this class of proteins, a conserved region which is located 
between the 'A' and the 'B' motifs of the ATP-binding site was used. 

Consensus pattern: [LIVMFYC]-[SA]-[SAPGLVFYKQH]-G-[DENQMW]-[KRQASPCLIMFW]-[KRNQSTAVM]-[KRACLVM]-[LIVMFYPAN]-{PHY}-[LIVMFW]-[SAGCLIVP]-{FYWHP}-{KRHP}-[LIVMFYWSTA] 

The ATP-binding region is duplicated in araG, mdl, msrA, rbsA, tlrC, uvrA, yejF, Mdr's, 
CFTR, pmd1 and in EF-3. In some of those proteins, the above pattern only detects one of 
the two copies of the domain. The proteins belonging to this family also contain one or two 
copies of the ATP-binding motifs 'A' and 'B'. 

[ 1] Higgins C.F., Hyde S.C., Mimmack M.M., Gileadi U., Gill D.R., Gallagher M.P. J. 
Bioenerg. Biomembr. 22:571-592(1990). 

[ 2] Higgins C.F., Gallagher M.P., Mimmack M.M., Pearce S.R. BioEssays 8:111-116(1988). 

[ 3] Higgins C.F., Hiles I.D., Salmond G.P.C., Gill D.R., Downie J.A., Evans I.J., Holland 
I.B., Gray L., Buckel S.D., Bell A.W., Hermodson M.A. Nature 323:448-450(1986). 

[ 4] Doolittle R.F., Johnson M.S., Husain I., van Houten B., Thomas D.C., Sancar A. Nature 
323:451-453(1986). 

[ 5] Blight M.A., Holland I.B. Mol. Microbiol. 4:873-880(1990). 

[ 6] Stoddard G.W., Petzel J.P., van Belkum M.J., Kok J., McKay L.L. Appl. Environ. 
Microbiol. 58:1952-1961(1992). 

[ 7] Rosteck P.R. Jr., Reynolds P.A., Hershberger C.L. Gene 102:27-32(1991). 

[ 8] Gottesman M.M., Pastan I. J. Biol. Chem. 263:12163-12166(1988). 

[ 9] Valle D., Gaertner J. Nature 361:682-683(1993). 

[10] Aguilar-Bryan L., Nichols C.G., Wechsler S.W., Clement J.P. IV, Boyd A.E. III, 
Gonzalez G., Herrera-Sosa H., Nguy K., Bryan J., Nelson D.A. Science 268:423-426(1995). 

4. (ACBP) 

Acyl-CoA-binding protein signature 

Acyl-CoA-binding protein (ACBP) is a small (10 Kd) protein that binds medium- and 
long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier 
of acyl-CoA esters [1]. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine 
(EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition 
site located on the GABA type A receptor. It is therefore possible that this protein also acts as 
a neuropeptide to modulate the action of the GABA receptor [2]. ACBP is a highly conserved 
protein of about 90 residues that has so far been found in vertebrates, insects and yeast. 
ACBP is also related to the N-terminal section of a probable transmembrane protein of 
unknown function which has been found in mammals. As a signature pattern, the region that 
corresponds to residues 19 to 37 in mammalian ACBP was selected. 

Consensus pattern: P-[STA]-x-[DEN]-x-[LIVMF]-x(2)-[LIVMFY]-Y-[GSTA]-x-[FY]-K-Q-[STA](2)-x-G- 

[ 1] Rose T.M., Schultz E.R., Todaro G.J. Proc. Natl. Acad. Sci. U.S.A. 
89:11287-11291(1992). 

[ 2] Costa E., Guidotti A. Life Sci. 49:325-344(1991). 

5. (AIRS) 

AIR synthase related proteins 

This family includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM 
synthase and selenide, water dikinase. 

6. (AMP-binding) 

Putative AMP-binding domain signature 

It has been shown [1 to 5] that a number of prokaryotic and eukaryotic enzymes, which all 
probably act via an ATP-dependent covalent binding of AMP to their substrate, share a 
region of sequence similarity. These enzymes are: 

- Insect luciferase (luciferin 4-monooxygenase). Luciferase produces light by catalyzing 
  the oxidation of luciferin in presence of ATP and molecular oxygen. 
- Alpha-aminoadipate reductase from yeast (gene LYS2). This enzyme catalyzes the 
  activation of alpha-aminoadipate by ATP-dependent adenylation and the reduction of 
  activated alpha-aminoadipate by NADPH. 
- Acetate-CoA ligase (acetyl-CoA synthetase), an enzyme that catalyzes the formation of 
  acetyl-CoA from acetate and CoA. 
- Long-chain-fatty-acid-CoA ligase, an enzyme that activates long-chain fatty acids for 
  both the synthesis of cellular lipids and their degradation via beta-oxidation. 
- 4-coumarate-CoA ligase (4CL), a plant enzyme that catalyzes the formation of 
  4-coumarate-CoA from 4-coumarate and coenzyme A; the branchpoint reactions between 
  general phenylpropanoid metabolism and pathways leading to various specific end products. 
- O-succinylbenzoic acid-CoA ligase (OSB-CoA synthetase) (gene menE) [6], a bacterial 
  enzyme involved in the biosynthesis of menaquinone (vitamin K2). 
- 4-Chlorobenzoate-CoA ligase (EC 6.2.1.-) (4-CBA-CoA ligase) [7], a Pseudomonas 
  enzyme involved in the degradation of 4-CBA. 
- Indoleacetate-lysine ligase (IAA-lysine synthetase) [8], an enzyme from Pseudomonas 
  syringae that converts indoleacetate to IAA-lysine. 
- Bile acid-CoA ligase (gene baiB) from Eubacterium strain VPI 12708 [4]. This enzyme 
  catalyzes the ATP-dependent formation of a variety of C-24 bile acid-CoA. 
- Crotonobetaine/carnitine-CoA ligase (EC 6.3.2.-) from Escherichia coli (gene caiC). 
- L-(alpha-aminoadipyl)-L-cysteinyl-D-valine synthetase (ACV synthetase) from various 
  fungi (gene acvA or pcbAB). This enzyme catalyzes the first step in the biosynthesis of 
  penicillin and cephalosporin, the formation of ACV from the constituent amino acids. The 
  amino acids seem to be activated by adenylation. It is a protein of around 3700 amino acids 
  that contains three related domains of about 1000 amino acids. 
- Gramicidin S synthetase I (gene grsA) from Bacillus brevis. This enzyme catalyzes the 
  first step in the biosynthesis of the cyclic antibiotic gramicidin S, the ATP-dependent 
  racemization of phenylalanine. 
- Tyrocidine synthetase I (gene tycA) from Bacillus brevis. The reaction carried out by 
  tycA is identical to that catalyzed by grsA. 
- Gramicidin S synthetase II (gene grsB) from Bacillus brevis. This enzyme is a 
  multifunctional protein that activates and polymerizes proline, valine, ornithine and leucine. 
  GrsB consists of four related domains. 
- Enterobactin synthetase components E (gene entE) and F (gene entF) from Escherichia 
  coli. These two enzymes are involved in the ATP-dependent activation of respectively 
  2,3-dihydroxybenzoate and serine during enterobactin (enterochelin) biosynthesis. 
- Cyclic peptide antibiotic surfactin synthase subunits 1, 2 and 3 from Bacillus subtilis. 
  Subunits 1 and 2 contain three related domains while subunit 3 only contains a single 
  domain. 
- HC-toxin synthetase (gene HTS1) from Cochliobolus carbonum. This enzyme activates 
  the four amino acids (Pro, L-Ala, D-Ala and 2-amino-9,10-epoxi-8-oxodecanoic acid) that 
  make up HC-toxin, a cyclic tetrapeptide. HTS1 consists of four related domains. 

There are also some proteins whose exact function is not yet known, but which are very 
probably also AMP-binding enzymes. These proteins are: 

- ORA (octapeptide-repeat antigen), a Plasmodium falciparum protein whose function is 
  not known but which shows a high degree of similarity with the above proteins. 
- AngR, a Vibrio anguillarum protein. AngR is thought to be a transcriptional activator 
  which modulates the anguibactin (an iron-binding siderophore) biosynthesis gene cluster 
  operon. But it is believed [9] that angR is not a DNA-binding protein, but rather an enzyme 
  involved in the biosynthesis of anguibactin. This conclusion is based on three facts: the 
  presence of the AMP-binding domain; the size of angR (1048 residues), which is far bigger 
  than any bacterial transcriptional protein; and the presence of a probable S-acyl thioesterase 
  immediately downstream of angR. 
- A hypothetical protein in mmsB 3' region in Pseudomonas aeruginosa. 
- Escherichia coli hypothetical protein ydiD. 
- Yeast hypothetical protein YBR041w. 
- Yeast hypothetical protein YBR222c. 
- Yeast hypothetical protein YER147c. 

All these proteins contain a highly conserved region very rich in glycine, serine, and 
threonine which is followed by a conserved lysine. A parallel can be drawn between this type 
of domain and the G-x(4)-G-K-[ST] ATP-/GTP-binding 'P-loop' domain or the protein 
kinases G-x-G-x(2)-[SG]-x(10,20)-K ATP-binding domains. 
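
For comparison, the two nucleotide-binding motifs named in the preceding paragraph can be located in a sequence with ordinary regular expressions. This is a minimal sketch under the assumption that sequences are given in one-letter code; the names `MOTIFS` and `find_motifs` are hypothetical and are not part of any library.

```python
import re

# Hand-translated regular expressions for the two motifs quoted in the text.
MOTIFS = {
    "P-loop": re.compile(r"G.{4}GK[ST]"),                    # G-x(4)-G-K-[ST]
    "kinase ATP-binding": re.compile(r"G.G.{2}[SG].{10,20}K"),  # G-x-G-x(2)-[SG]-x(10,20)-K
}


def find_motifs(sequence: str) -> list[tuple[str, int, str]]:
    """Return (motif name, 1-based start position, matched text) for each hit."""
    seq = sequence.upper()
    hits = []
    for name, regex in MOTIFS.items():
        # finditer reports non-overlapping matches, scanning left to right.
        for m in regex.finditer(seq):
            hits.append((name, m.start() + 1, m.group()))
    return hits
```

Because `finditer` reports only non-overlapping matches, a second motif copy that overlaps an earlier hit would not be listed; a lookahead-based scan would be needed if overlapping occurrences matter.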

Consensus pattern: [LIVMFY]-x(2)-[STG]-[STAG]-G-[ST]-[STEI]-[SG]-x-[PASLIVM]-[KR] 
In a majority of cases the residue that follows the Lys at the end of the pattern is a Gly. 

[ 1] Toh H. Protein Seq. Data Anal. 4:111-117(1991). 

[ 2] Smith D.J., Earl A.J., Turner G. EMBO J. 9:2743-2750(1990). 

[ 3] Schroeder J. Nucleic Acids Res. 17:460-460(1989). 

[ 4] Mallonee D.H., Adams J.L., Hylemon P.B. J. Bacteriol. 174:2065-2071(1992). 
[ 5] Turgay K., Krause M., Marahiel M.A. Mol. Microbiol. 6:529-546(1992). 
[ 6] Driscoll J.R., Taber H.W. J. Bacteriol. 174:5063-5071(1992). 

[ 7] Babbitt P.C., Kenyon G.L., Matin B.M., Charest H., Sylvestre M., Scholten J.D., Chang 
K.-H., Liang P.-H., Dunaway-Mariano D. Biochemistry 31:5594-5604(1992). 
[ 8] Farrell D.H., Mikesell P., Actis L.A., Crosa J.H. Gene 86:45-51(1990). 

7. AP2 domain 

This 60 amino acid residue domain can bind to DNA [1]. This domain is plant specific. 
Members of this family are suggested to be related to pyridoxal phosphate-binding domains 
such as found in aminotran 2 [3]. AP2 domains are also described in Jofuku et al., co-pending 
U.S. Patent applications 08/700,152, 08/879,827, 08/912,272, 09/026,039. 

[1] Ohme-Takagi M, Shinshi H; Plant Cell 1995;7:173-182. 

[2] Weigel D; Plant Cell 1995;7:388-389. 

[3] Mushegian AR, Koonin EV; Genetics 1996;144:817-828. 

8. ARID 

The ARID domain is an AT-Rich Interaction domain sharing structural homology to DNA 
replication and repair nucleases and polymerases. 

[1] Herrscher RF, Kaplan MH, Lelsz DL, Das C, Scheuermann R, Tucker PW; Genes Dev 
1995;9:3067-3082. 

[2] Yuan YC, Whitson RH, Liu Q, Itakura K, Chen Y; Nat Struct Biol 1998;5:959-964. 

9. (ATPsynt) 

ATP synthase gamma subunit signature 

ATP synthase (proton-translocating ATPase) (EC 3.6.1.34) [1,2] is a component of the 
cytoplasmic membrane of eubacteria, the inner membrane of mitochondria, and the thylakoid 
membrane of chloroplasts. The ATPase complex is composed of an oligomeric 
transmembrane sector, called CF(0), and a catalytic core, called coupling factor CF(1). The 
former acts as a proton channel; the latter is composed of five subunits, alpha, beta, gamma, 
delta and epsilon. Subunit gamma is believed to be important in regulating ATPase activity 
and the flow of protons through the CF(0) complex. The best conserved region of the gamma 
subunit [3] is its C-terminus, which seems to be essential for assembly and catalysis. As a 
signature pattern to detect ATPase gamma subunits, a 14 residue conserved segment where 
the last amino acid is found one to three residues from the C-terminal extremity was used. 

Consensus pattern: [IV]-T-x-E-x(2)-[DE]-x(3)-G-A-x-[SAKR]- 
Note: Pea chloroplast gamma and two Bacillus species gamma subunits are not detected by 
this motif. 

[ 1] Futai M., Noumi T., Maeda M. Annu. Rev. Biochem. 58:111-136(1989). 
[ 2] Senior A.E. Physiol. Rev. 68:177-231(1988). 
[ 3] Miki J., Maeda M., Mukohata Y., Futai M. FEBS Lett. 232:221-226(1988). 

10. (ATP Synt A) 

ATP synthase a subunit signature 

ATP synthase (proton-translocating ATPase) (EC 3.6.1.34) [1,2] is a component of the 
cytoplasmic membrane of eubacteria, the inner membrane of mitochondria, and the thylakoid 
membrane of chloroplasts. The ATPase complex is composed of an oligomeric 
transmembrane sector, called CF(0), which acts as a proton channel, and a catalytic core, 
termed coupling factor CF(1). The CF(0) a subunit, also called protein 6, is a key component 
of the proton channel; it may play a direct role in translocating protons across the membrane. 
It is a highly hydrophobic protein that has been predicted to contain 8 transmembrane regions 
[3]. Sequence comparison of a subunits from all available sources reveals very few conserved 
regions. The best conserved region is located in what is predicted to be the fifth 
transmembrane domain. This region contains three perfectly conserved residues: an arginine, 
a leucine and an asparagine. Mutagenesis experiments of ATPase activity. This region was 
selected as a signature pattern. 

Consensus pattern: [STAGN]-x-[STAG]-[LIVMF]-R-L-x-[SAGV]-N-[LIVMT] [R is 
important for proton translocation] 

[ 1] Futai M., Noumi T., Maeda M. Annu. Rev. Biochem. 58:111-136(1989). 
[ 2] Senior A.E. Physiol. Rev. 68:177-231(1988). 
[ 3] Lewis M.L., Chang J.A., Simoni R.D. J. Biol. Chem. 265:10541-10550(1990). 
[ 4] Cain B.D., Simoni R.D. J. Biol. Chem. 264:3292-3300(1989). 

11. ATP synthase B 

Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate 
protons through the membrane (inner membrane in mitochondria, thylakoid membrane in plants, 
cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of 
the CF(1) subunits. 

12. (ATP synt C) 

ATP synthase c subunit signature 

ATP synthase (proton-translocating ATPase) [1,2] is a component of the cytoplasmic 
membrane of eubacteria, the inner membrane of mitochondria, and the thylakoid membrane of 
chloroplasts. The ATPase complex is composed of an oligomeric transmembrane sector, 
called CF(0), which acts as a proton channel, and a catalytic core, termed coupling factor 
CF(1). The CF(0) c subunit (also called protein 9, proteolipid, or subunit III) [3,4] is a highly 
hydrophobic protein of about 8 Kd which has been implicated in the proton-conducting 
activity of ATPase. Structurally, subunit c consists of two long terminal hydrophobic regions, 
which probably span the membrane, and a central hydrophilic region. 
N,N'-dicyclohexylcarbodiimide (DCCD) can bind covalently to subunit c and thereby abolish the 
ATPase activity. DCCD binds to a specific glutamate or aspartate residue which is located in 
the middle of the second hydrophobic region near the C-terminus of the protein. A signature 
pattern which includes the DCCD-binding residue was derived. 

Consensus pattern: [GSTA]-R-[NQ]-P-x(10)-[LIVMFYW](2)-x(3)-[LIVMFYW]-x-[DE] [D 
or E binds DCCD] 
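
Because the DCCD-binding residue is the final position of this pattern, its location in a sequence follows directly from where a match ends. The sketch below is illustrative only: it assumes a one-letter protein sequence, uses a hand-translated regular expression, and the names `SUBUNIT_C` and `dccd_site` are hypothetical.

```python
import re

# [GSTA]-R-[NQ]-P-x(10)-[LIVMFYW](2)-x(3)-[LIVMFYW]-x-[DE]
SUBUNIT_C = re.compile(r"[GSTA]R[NQ]P.{10}[LIVMFYW]{2}.{3}[LIVMFYW].[DE]")


def dccd_site(sequence: str):
    """Return (1-based position, residue) of the putative DCCD-binding D/E, or None."""
    seq = sequence.upper()
    m = SUBUNIT_C.search(seq)
    if m is None:
        return None
    # m.end() is an exclusive 0-based offset, which equals the 1-based
    # position of the last matched residue (the D or E of the pattern).
    position = m.end()
    return position, seq[position - 1]
```

Only the first match is reported here; a sequence with several candidate sites would need `finditer` instead of `search`.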

[ 1] Futai M., Noumi T., Maeda M. Annu. Rev. Biochem. 58:111-136(1989). 
[ 2] Senior A.E. Physiol. Rev. 68:177-231(1988). 

[ 3] Ivaschenko A.T., Karpenyuk T.A., Ponomarenko S.V. Biokhimiia 56:406-419(1991). 
[ 4] Recipon H., Perasso R., Adoutte A., Quetier F. J. Mol. Evol. 34:292-303(1992). 

13. (ATP synt DE) 

ATP synthase, Delta/Epsilon chain 

Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. 
The subunits are called delta and epsilon in human and metazoan species, but in bacterial 
species the delta (D) subunit is the equivalent of the oligomycin sensitive subunit (OSCP) in 
metazoans. 

14. (ATP synt ab) 

ATP synthase alpha and beta subunits signature 

ATP synthase (proton-translocating ATPase) [1,2] is a component of the cytoplasmic 
membrane of eubacteria, the inner membrane of mitochondria, and the thylakoid membrane of 
chloroplasts. The ATPase complex is composed of an oligomeric transmembrane sector, 
called CF(0), and a catalytic core, called coupling factor CF(1). The former acts as a proton 
channel; the latter is composed of five subunits, alpha, beta, gamma, delta and epsilon. The 
sequences of subunits alpha and beta are related and both contain a nucleotide-binding site 
for ATP and ADP. The beta chain has catalytic activity, while the alpha chain is a regulatory 
subunit. Vacuolar ATPases [3] (V-ATPases) are responsible for acidifying a variety of 
intracellular compartments in eukaryotic cells. Like F-ATPases, they are oligomeric 
complexes of a transmembrane and a catalytic sector. The sequence of the largest subunit of 
the catalytic sector (70 Kd) is related to that of F-ATPase beta subunit, while a 60 Kd subunit, 
from the same sector, is related to the F-ATPases alpha subunit [4]. Archaebacterial 
membrane-associated ATPases are composed of three subunits. The alpha chain is related to 
F-ATPases beta chain and the beta chain is related to F-ATPases alpha chain [4]. A protein 
highly similar to F-ATPase beta subunits is found [5] in some bacterial apparatus involved in 
a specialized protein export pathway that proceeds without signal peptide cleavage. This 
protein is known as fliI in Bacillus and Salmonella, Spa47 (mxiB) in Shigella flexneri, HrpB6 
in Xanthomonas campestris and yscN in Yersinia virulence plasmids. To detect these ATPase 
subunits, a segment of ten amino-acid residues, containing two conserved serines, was 
selected as a signature pattern. The first serine seems to be important for catalysis (in the 
ATPase alpha chain at least), as its mutagenesis causes catalytic impairment. 

Consensus pattern: P-[SAP]-[LIV]-[DNH]-x(3)-S-x-S [The first S is a putative active site 
residue] 

[ 1] Futai M., Noumi T., Maeda M. Annu. Rev. Biochem. 58:111-136(1989). 

[ 2] Senior A.E. Physiol. Rev. 68:177-231(1988). 

[ 3] Nelson N. J. Bioenerg. Biomembr. 21:553-571(1989). 

[ 4] Gogarten J.P., Kibak H., Dittrich P., Taiz L., Bowman E.J., Bowman B.J., Manolson 
M.F., Poole R.J., Date T., Oshima T., Konishi J., Denda K., Yoshida M. Proc. Natl. Acad. 
Sci. U.S.A. 86:6661-6665(1989). 

[ 5] Dreyfus G., Williams A.W., Kawagishi I., MacNab R.M. J. Bacteriol. 
175:3131-3138(1993). 

15. (ATP synt ab C) 

ATP synthase ab C terminal. 

Number of members: 190 

[1] Abrahams JP, Leslie AG, Lutter R, Walker JE; "Structure at 2.8 A resolution of 
F1-ATPase from bovine heart mitochondria." Nature 1994;370:621-628. 

16. (A deaminase) 
Adenosine and AMP deaminase signature 

Adenosine deaminase catalyzes the hydrolytic deamination of adenosine into inosine. AMP 
deaminase catalyzes the hydrolytic deamination of AMP into IMP. It has been shown [1] that 
these two types of enzymes share three regions of sequence similarities; these regions are 
centered on residues which are proposed to play an important role in the catalytic mechanism 
of these two enzymes. One of these regions, containing two conserved aspartic acid residues 
that are potential active site residues, was selected as a signature pattern. 

Consensus pattern: [SA]-[LIVM]-[NGS]-[STA]-D-D-P [The two D's are putative active site 
residues] 



[ 1] Chang Z., Nygaard P., Chinault A.C., Kellems R.E. Biochemistry 30:2273-2280(1991). 

17. (Acetyltransf) 
Acetyltransferase (GNAT) family. 

This family contains proteins with N-acetyltransferase functions. 

[1] Neuwald AF, Landsman D; Trends Biochem Sci 1997;22:154-155. 

18. (Aconitase C) 
Aconitase family signature 

Aconitase (aconitate hydratase) (EC 4.2.1.3) [1] is the enzyme from the tricarboxylic acid 
cycle that catalyzes the reversible isomerization of citrate and isocitrate. Cis-aconitate is 
formed as an intermediary product during the course of the reaction. In eukaryotes two 
isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the 
other found in the cytoplasm. Aconitase, in its active form, contains a 4Fe-4S iron-sulfur 
cluster; three cysteine residues have been shown to be ligands of the 4Fe-4S cluster. It has 
been shown that the aconitase family also contains the following proteins: 

- Iron-responsive element binding protein (IRE-BP). IRE-BP is a cytosolic protein that 
  binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR 
  of ferritin and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin 
  receptor mRNA. IRE-BP also expresses aconitase activity. 
- 3-isopropylmalate dehydratase (EC 4.2.1.33) (isopropylmalate isomerase), the enzyme 
  that catalyzes the second step in the biosynthesis of leucine. 
- Homoaconitase (EC 4.2.1.36) (homoaconitate hydratase), an enzyme that participates 
  in the alpha-aminoadipate pathway of lysine biosynthesis and that converts 
  cis-homoaconitate into homoisocitric acid. 
- Escherichia coli protein ybhJ. 

As a signature for proteins from the aconitase family, two conserved regions that contain 
the three cysteine ligands of the 4Fe-4S cluster were selected. 

Consensus pattern: [LIVM]-x(2)-[GSACIVM]-x-[LIV]-[GTIV]-[STP]-C-x(0,1)-T-N-[GSTANI]-x(4)-[LIVMA] 
[C binds the iron-sulfur center] 

Consensus pattern: G-x(2)-[LIVWPQ]-x(3)-[GAC]-C-[GSTAM]-[LIMPTA]-C-[LIMV]-[GA] 
[The two C's bind the iron-sulfur center] 

[ 1] Gruer M.J., Artymiuk P.J., Guest J.R. Trends Biochem. Sci. 22:3-6(1997). 

19. (Acyl-CoA dh) 

Acyl-CoA dehydrogenases signatures 

Acyl-CoA dehydrogenases [1,2,3] are enzymes that catalyze the alpha,beta-dehydrogenation 
of acyl-CoA esters and transfer electrons to ETF, the electron transfer protein. Acyl-CoA 
dehydrogenases are FAD flavoproteins. This family currently includes: 

- Five eukaryotic isozymes that catalyze the first step of the beta-oxidation cycles for 
  fatty acids with various chain lengths. These are short (SCAD) (EC 1.3.99.2), medium 
  (MCAD) (EC 1.3.99.3), long (LCAD) (EC 1.3.99.13), very-long (VLCAD) and 
  short/branched (SBCAD) chain acyl-CoA dehydrogenases. These enzymes are located in the 
  mitochondrion. They are all homotetrameric proteins of about 400 amino acid residues 
  except VLCAD, which is a dimer and which contains, in its mature form, about 600 residues. 
- Glutaryl-CoA dehydrogenase (EC 1.3.99.7) (GCDH), which is involved in the 
  catabolism of lysine, hydroxylysine and tryptophan. 
- Isovaleryl-CoA dehydrogenase (EC 1.3.99.10) (IVD), involved in the catabolism of 
  leucine. 
- Acyl-CoA dehydrogenases acsA and mmgC from Bacillus subtilis. 
- Butyryl-CoA dehydrogenase (EC 1.3.99.2) from Clostridium acetobutylicum. 
- Escherichia coli protein caiA [4]. 
- Escherichia coli protein aidB. 

Two conserved regions were selected as signature patterns. The first is located in the 
center of these enzymes, the second in the C-terminal section. 

Consensus pattern: [GAC]-[LIVM]-[ST]-E-x(2)-[GSAN]-G-[ST]-D-x(2)-[GSA] 
Consensus pattern: [QDE]-x(2)-G-[GS]-x-G-[LIVMFY]-x(2)-[DEN]-x(4)-[KR]-x(3)-[DEN]
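
The consensus patterns in this description follow the PROSITE notation: elements are separated by hyphens, x stands for any residue, square brackets list the allowed residues, curly braces list excluded residues, and a parenthesized number or range gives a repeat count. As a minimal sketch of how such a pattern could be scanned against a sequence, the Python snippet below translates the notation into a regular expression. The function name prosite_to_regex and the toy peptide are assumptions made for illustration; they are not part of the specification.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex string."""
    parts = []
    # Drop stray spaces and any trailing hyphen, then walk the '-'-separated elements.
    for element in pattern.replace(" ", "").strip("-").split("-"):
        # Separate the residue part from an optional repeat suffix: (n) or (n,m).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.groups()
        if core == "x":                      # x = any residue
            core = "."
        elif core.startswith("{"):           # {ABC} = any residue except A, B, C
            core = "[^" + core[1:-1] + "]"
        if low is not None:                  # repeat count or range
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)

# First acyl-CoA dehydrogenase signature, as listed above.
ACAD_PATTERN_1 = "[GAC]-[LIVM]-[ST]-E-x(2)-[GSAN]-G-[ST]-D-x(2)-[GSA]"
acad_regex = re.compile(prosite_to_regex(ACAD_PATTERN_1))

# Made-up peptide for illustration only; not taken from any real enzyme.
toy_sequence = "MKALSEAAGGSDAAGKK"
hit = acad_regex.search(toy_sequence)
print(acad_regex.pattern, hit.span() if hit else None)
```

A regular expression is used here only because the notation maps onto it element by element; dedicated pattern-scanning tools may handle overlapping matches and anchors differently.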

[ 1] Tanaka K., Ikeda, Matsubara Y., Hyman D.B. Enzyme 38:91-107(1987). 

[ 2] Matsubara Y., Indo Y., Naito E., Ozasa H., Glassberg R., Vockley J., Ikeda Y., Kraus J.,
Tanaka K. J. Biol. Chem. 264:16321-16331(1989).

[ 3] Aoyama T., Ueno I., Kamijo T., Hashimoto T. J. Biol. Chem. 269:19088-19094(1994).
[ 4] Eichler K., Bourgis F., Buchet A., Kleber H.-P., Mandrand-Berthelot M.-A. Mol. 
Microbiol. 13:775-786(1994). 

20. (Acyl transf) 

Acyl transferase domain 

Number of members: 161 

[1] Serre L, Verbree EC, Dauter Z, Stuitje AR, Derewenda ZS; Medline: 95286570 "The 
Escherichia coli malonyl-CoA:acyl carrier protein transacylase at 1.5-A resolution. Crystal 
structure of a fatty acid synthase component." J Biol Chem 1995;270:12961-12964. 

21. Acylphosphatase signatures 

Acylphosphatase (EC 3.6.1.7) [1,2] catalyzes the hydrolysis of various acylphosphate
carboxyl-phosphate bonds such as carbamyl phosphate, succinylphosphate,
1,3-diphosphoglycerate, etc. The physiological role of this enzyme is not yet clear.
Acylphosphatase is a small protein of around 100 amino-acid residues. There are two known
isozymes. One seems to be specific to muscular tissues; the other, called 'organ-common
type', is found in many different tissues. While acylphosphatases have so far been
characterized only in vertebrates, there are a number of bacterial and archaebacterial
hypothetical proteins that are highly similar to that enzyme and that probably possess the same
activity. These proteins are: - Escherichia coli hypothetical protein yccX. - Bacillus subtilis
hypothetical protein yflL. - Archaeoglobus fulgidus hypothetical protein AF0818. Two
conserved regions were selected as signature patterns. The first is located in the N-terminal
section, while the second is found in the central part of the protein sequence.

Consensus pattern: [LIV]-x-G-x-V-Q-G-V-x-[FM]-R 

Consensus pattern: G-[FYW]-[AVC]-[KRQAM]-N-x(3)-G-x-V-x(5)-G

[ 1] Stefani M., Ramponi G. Life Chem. Rep. 12:271-301(1995).

[ 2] Stefani M., Taddei N., Ramponi G. Cell. Mol. Life Sci. 53:141-151(1997). 

22. (Adap comp sub) 

Clathrin adaptor complexes medium chain signatures. 

Clathrin coated vesicles (CCV) mediate intracellular membrane traffic such as
receptor-mediated endocytosis. In addition to clathrin, the CCV are composed of a number of other
components including oligomeric complexes which are known as adaptor or clathrin assembly
proteins (AP) complexes [1]. The adaptor complexes are believed to interact with the
cytoplasmic tails of membrane proteins, leading to their selection and concentration. In
mammals two types of adaptor complexes are known: AP-1, which is associated with the Golgi
complex, and AP-2, which is associated with the plasma membrane. Both AP-1 and AP-2 are
heterotetramers that consist of two large chains - the adaptins - (gamma and beta' in AP-1;
alpha and beta in AP-2); a medium chain (AP47 in AP-1; AP50 in AP-2) and a small chain
(AP19 in AP-1; AP17 in AP-2). The medium chains of AP-1 and AP-2 are evolutionarily
related proteins of about 50 Kd. Homologs of AP47 and AP50 have also been found in
Caenorhabditis elegans (genes unc-101 and ap50) [2] and yeast (gene APM1 or YAP54)
[3]. Some more divergent, but clearly evolutionarily related proteins have also been found in
yeast: APM2 and YBR288c. Two conserved regions were selected as signature patterns, one
located in the N-terminal region, the other from the central section of these proteins.

Consensus pattern: [IVT]-[GSP]-W-R-x(2,3)-[GAD]-x(2)-[HY]-x(2)-N-x-[LIVMAFY](3)-
D-[LIVM]-[LIVMT]-E

Consensus pattern: [LIV]-x-F-I-P-P-x-G-x-[LIVMFY]-x-L-x(2)-Y 

[ 1] Pearse B.M., Robinson M.S. Annu. Rev. Cell Biol. 6:151-171(1990).

[ 2] Lee J., Jongeward G.D., Sternberg P.W. Genes Dev. 8:60-73(1994). 

[ 3] Nakayama Y., Goebl M., O'Brine G.B., Lemmon S., Pingchang C.E., Kirchhausen T.
Eur. J. Biochem. 202:569-574(1991).

23. (Adenylsucc synt)

Adenylosuccinate synthetase signatures

Adenylosuccinate synthetase (EC 6.3.4.4) [1] plays an important role in purine biosynthesis,
by catalyzing the GTP-dependent conversion of IMP and aspartic acid to AMP.
Adenylosuccinate synthetase has been characterized from various sources ranging from
Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present -
one involved in purine biosynthesis and the other in the purine nucleotide cycle. Two
conserved regions were selected as signature patterns. The first one is a perfectly conserved
octapeptide located in the N-terminal section and which is involved in GTP-binding [2]. The
second one includes a lysine residue known [2] to be essential for the enzyme's activity.

Consensus pattern: Q-W-G-D-E-G-K-G

Consensus pattern: G-I-[GR]-P-x-Y-x(2)-K-x(2)-R [K is the active site residue] 
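
The bracketed note on the second pattern marks one position (the lysine) as the active site residue. As a hedged illustration of how that annotation could be used, the sketch below hand-translates the pattern into a Python regular expression and reports the 1-based position of the flagged lysine in each match. The names PURA_SITE and active_site_positions, and the toy peptide, are hypothetical.

```python
import re

# Hand translation of: G-I-[GR]-P-x-Y-x(2)-K-x(2)-R
# The parentheses capture the lysine flagged as the active site residue.
PURA_SITE = re.compile(r"GI[GR]P.Y.{2}(K).{2}R")

def active_site_positions(sequence: str) -> list:
    """Return the 1-based position of the captured lysine for every match."""
    return [m.start(1) + 1 for m in PURA_SITE.finditer(sequence)]

# Made-up peptide for illustration only.
toy_sequence = "MSGIGPAYAAKAARLL"
print(active_site_positions(toy_sequence))  # [11]
```

Replacing the lysine with any other residue removes the match entirely, which is the behaviour one would expect from a signature built around an essential residue.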

[ 1] Wiesmueller L., Wittbrodt J., Noegel A.A., Schleicher M. J. Biol. Chem.
266:2480-2485(1991).

[ 2] Silva M.M., Poland B.W., Hoffman C.R., Fromm H.J., Honzatko R.B. J. Mol. Biol. 
254:431-446(1995). 

[ 3] Bouyoub A., Barbier G., Forterre P., Labedan B. J. Mol. Biol.
261:144-154(1996).

24. (AdoHcyase) 

S-adenosyl-L-homocysteine hydrolase signatures 

S-adenosyl-L-homocysteine hydrolase (EC 3.3.1.1) (AdoHcyase) is an enzyme of the
activated methyl cycle, responsible for the reversible hydration of
S-adenosyl-L-homocysteine into adenosine and homocysteine. AdoHcyase is a ubiquitous enzyme which
binds and requires NAD+ as a cofactor. AdoHcyase is a highly conserved protein [1] of about
430 to 470 amino acids. Two highly conserved regions were selected as signature patterns.
The first pattern is located in the N-terminal section; the second is derived from a glycine-rich
region in the central part of AdoHcyase, a region thought to be involved in NAD-binding.

Consensus pattern: [GSA]-[CS]-N-x-[FYLM]-S-[ST]-[QA]-[DEN]-x-[AV]-[AT]-[AD]- 
[AC]-[LIVMCG] 

Consensus pattern: [GA]-[KS]-x(3)-[LIV]-x-G-[FY]-G-x-[VC]-G-[KRL]-G-x-[ASC] 

[ 1] Sganga M.W., Aksamit R.R., Cantoni G.L., Bauer C.E. Proc. Natl. Acad. Sci. U.S.A. 
89:6328-6332(1992). 

25. AhpC/TSA family 

This family contains proteins related to alkyl hydroperoxide reductase (AhpC) and
thiol specific antioxidant (TSA). 

[1] Chae HZ, Robison K, Poole LB, Church G, Storz G, Rhee SG, Proc Natl Acad Sci U S A 
1994;91:7017-7021 

26. (Aldose epim) 

Aldose 1-epimerase putative active site

Aldose 1-epimerase (EC 5.1.3.3) (mutarotase) is the
enzyme responsible for the anomeric interconversion of D-glucose and other aldoses
between their alpha- and beta-forms. The sequence of mutarotase from two bacteria,
Acinetobacter calcoaceticus and Streptococcus thermophilus, is available [1]. It has also been
shown that, on the basis of extensive sequence similarities, a mutarotase domain seems to be
present in the C-terminal half of the fungal GAL10 protein, which encodes, in its N-terminal
part, UDP-glucose 4-epimerase. The best conserved region in the sequence of
mutarotase is centered around a conserved histidine residue which may be involved in the
catalytic mechanism.

Consensus pattern: [NS]-x-T-N-H-x-Y-[FW]-N-[LI]

[ 1] Poolman B., Royer T.J., Mainzer S.E., Schmidt B.F. J. Bacteriol. 172:4037-4047(1990).

27. (AlkA DNA repair)

Alkylbase DNA glycosidases alkA family signature 

Alkylbase DNA glycosidases [1] are DNA repair enzymes that hydrolyze the deoxyribose
N-glycosidic bond to excise various alkylated bases from a damaged DNA polymer. In
Escherichia coli there are two alkylbase DNA glycosidases: one (gene tag) which is
constitutively expressed and which is specific for the removal of 3-methyladenine (EC
3.2.2.20), and one (gene alkA) which is induced during adaptation to alkylation and which
can remove a variety of alkylation products (EC 3.2.2.21). Tag and alkA do not share any
region of sequence similarity. In yeast there is an alkylbase DNA glycosidase (gene MAG1)
[2,3], which can remove 3-methyladenine or 7-methyladenine and which is structurally
related to alkA. MAG and alkA are both proteins of about 300 amino acid residues. While
the C- and N-terminal ends appear to be unrelated, there is a central region of about 130
residues which is well conserved. A portion of this region has been selected as a signature
pattern.

Consensus pattern: G-I-G-x-W-[ST]-[AV]-x-[LIVMFY](2)-x-[LIVM]-x(8)-[MF]-x(2)- 
[ED]-D 

[ 1] Lindahl T., Sedgwick B. Annu. Rev. Biochem. 57:133-157(1988). 

[ 2] Berdal K.G., Bjoras M., Bjelland S., Seeberg E.G. EMBO J. 9:4563-4568(1990). 

[ 3] Chen J., Derfler B., Samson L. EMBO J. 9:4569-4575(1990). 

28. Ammonium transporters signature 



A number of proteins involved in the transport of ammonium ions across a membrane as well
as some yet uncharacterized proteins have been shown [1,2] to be evolutionarily related. These
proteins are: - Yeast ammonium transporters MEP1, MEP2 and MEP3. - Arabidopsis
thaliana high affinity ammonium transporter (gene AMT1). - Corynebacterium glutamicum
ammonium and methylammonium transport system. - Escherichia coli putative ammonium
transporter amtB. - Bacillus subtilis nrgA. - Mycobacterium tuberculosis hypothetical
protein MtCY338.09c. - Synechocystis strain PCC 6803 hypothetical proteins sll0108,
sll0537 and sll1017. - Methanococcus jannaschii hypothetical proteins MJ0058 and MJ1343.
- Caenorhabditis elegans hypothetical proteins C05E11.4, F49E11.3 and M195.3. As
expected by their transport function, these proteins are highly hydrophobic and seem to
contain from 10 to 12 transmembrane domains. The best conserved region seems to be
located in the fifth (or sixth) transmembrane region and is used as a signature pattern.

Consensus pattern: D-[FYWS]-A-G-[GSC]-x(2)-[IV]-x(3)-[SAG](2)-x(2)-[SAG]-[LIVMF]-
x(3)-[LIVMFYWA](2)-x-[GK]-x-R

[ 1] Ninnemann O., Jauniaux J.-C., Frommer W.B. EMBO J. 13:3464-3471(1994).

[ 2] Siewe R.M., Weil B., Burkovski A., Eikmanns B.J., Eikmanns M., Kraemer R. J. Biol. 
Chem. 271:5398-5403(1996). 

[ 3] Saier M.H. Jr. Adv. Microbiol. Physiol. 40:81-136(1998). 

29. (Arch_histone) 
CBF/NF-Y subunits signatures 

Diverse DNA binding proteins are known to bind the CCAAT box, a common cis-acting
element found in the promoter and enhancer regions of a large number of genes in
eukaryotes. Amongst these proteins is one known as the CCAAT-binding factor (CBF) or
NF-Y [1]. CBF is a heteromeric transcription factor that consists of two different components,
both needed for DNA-binding. The HAP protein complex of yeast binds to the upstream
activation site of the cytochrome C iso-1 gene (CYC1) as well as other genes involved in
mitochondrial electron transport and activates their expression. It also recognizes the
sequence CCAAT and is structurally and evolutionarily related to CBF. The first subunit of
CBF, known as CBF-A or NF-YB in vertebrates, HAP3 in budding yeast and php3 in
fission yeast, is a protein of 116 to 210 amino-acid residues which contains a highly
conserved central domain of about 90 residues. This domain seems to be involved in
DNA-binding; a signature pattern has been developed from its central part. The second subunit of
CBF, known as CBF-B or NF-YA in vertebrates, HAP2 in budding yeast and php2 in fission
yeast, is a protein of 265 to 350 amino-acid residues which contains a highly conserved
region of about 60 residues. This region, called the 'essential core' [2], seems to consist of two
subdomains: an N-terminal subunit-association domain and a C-terminal DNA recognition
domain. A signature pattern has been developed from a section of the subunit-association
domain.

Consensus pattern: C-V-S-E-x-I-S-F-[LIVM]-T-[SG]-E-A-[SC]-[DE]-[KRQ]-C-

Consensus pattern: Y-V-N-A-K-Q-Y-x-R-I-L-K-R-R-x-A-R-A-K-L-E- 

[ 1] Li X.-Y., Mantovani R., Hooft van Huijsduijnen R., Andre I., Benoist C., Mathis D.
Nucleic Acids Res. 20:1087-1091(1992).

[ 2] Olesen J.T., Fikes J.D., Guarente L. Mol. Cell. Biol. 11:611-619(1991). 

30. Argininosuccinate synthase signatures 

Argininosuccinate synthase (EC 6.3.4.5) (AS) is a urea cycle enzyme that catalyzes the
penultimate step in arginine biosynthesis: the ATP-dependent ligation of citrulline to
aspartate to form argininosuccinate, AMP and pyrophosphate [1,2]. In humans, a defect in the
AS gene causes citrullinemia, a genetic disease characterized by severe vomiting spells and
mental retardation. AS is a homotetrameric enzyme of chains of about 400 amino-acid
residues. An arginine seems to be important for the enzyme's catalytic mechanism. The
sequences of AS from various prokaryotes, archaebacteria and eukaryotes show significant
similarity. Two signature patterns have been selected for AS. The first is a highly conserved
stretch of nine residues located in the N-terminal extremity of these enzymes; the second is
derived from a conserved region which contains one of the conserved arginine residues.



Consensus pattern: [AS]-[FY]-S-G-G-[LV]-D-T-[ST]-

Consensus pattern: G-x-T-x-K-G-N-D-x(2)-R-F-

[ 1] van Vliet F., Crabeel M., Boyen A., Tricot C., Stalon V., Falmagne P., Nakamura Y.,
Baumberg S., Glansdorff N. Gene 95:99-104(1990).
[ 2] Morris C.J., Reeve J.N. J. Bacteriol. 170:3125-3130(1988).

31. Armadillo/beta-catenin-like repeats

Approx. 40 amino acid repeat. Tandem repeats form a super-helix of helices that is proposed to
mediate interaction of beta-catenin with its ligands. CAUTION: This family does not contain
all known armadillo repeats.

[1] Huber AH, Nelson WJ, Weis WI, Cell 1997;90:871-882. 
[2] Gumbiner BM, Curr Opin Cell Biol 1995;7:634-640. 
[3] Cavallo R, Rubenstein D, Peifer M, Curr Opin Genet Dev 1997;7:459-466.
[4] Su LK, Vogelstein B, Kinzler KW, Science 1993;262:1734-1737.
[5] Masiarz FR, Munemitsu S, Polakis P, Science 1993;262:1731-1734.
[6] Peifer M, Wieschaus E, Cell 1990;63:1167-1176. 

32. (Asn Synthase) 
Asparagine synthase 

This family is always found associated with GATase_2. Members of this family catalyse the
conversion of aspartate to asparagine.

33. Asparaginase_2 
Asparaginase. Number of members: 12

34. (Aspartyl tRNA N) 

Aminoacyl-transfer RNA synthetases class-II signatures

Aminoacyl-tRNA synthetases (EC 6.1.1.-) [1] are a group of enzymes which activate amino
acids and transfer them to specific tRNA molecules as the first step in protein biosynthesis. In
prokaryotic organisms there are at least twenty different types of aminoacyl-tRNA
synthetases, one for each different amino acid. In eukaryotes there are generally two
aminoacyl-tRNA synthetases for each different amino acid: one cytosolic form and one
mitochondrial form. While all these enzymes have a common function, they are widely
diverse in terms of subunit size and of quaternary structure. The synthetases specific for
alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine,
and threonine are referred to as class-II synthetases [2 to 6] and probably have a common
folding pattern in their catalytic domain for the binding of ATP and amino acid which is
different from the Rossmann fold observed for the class I synthetases [7]. Class-II tRNA
synthetases do not share a high degree of similarity; however, at least three conserved regions
are present [2,5,8]. Signature patterns have been derived from two of these regions.

Consensus pattern: [FYH]-R-x-[DE]-x(4,12)-[RH]-x(3)-F-x(3)-[DE] 

Consensus pattern: [GSTALVF]-{DENQHRKP}-[GSTA]-[LIVMF]-[DE]-R-[LIVMF]-x-
[LIVMSTAG]-[LIVMFY]
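
The second class-II pattern is the only one in this group that uses curly braces, which in PROSITE notation list residues that must not occur at that position. The sketch below, which assumes a hand translation into a Python regular expression, shows the effect of that exclusion on two made-up ten-residue peptides that differ only at the second position. The names CLASS_II_MOTIF and has_motif are hypothetical.

```python
import re

# Hand translation of the second class-II pattern; {DENQHRKP} becomes [^DENQHRKP].
CLASS_II_MOTIF = re.compile(
    r"[GSTALVF][^DENQHRKP][GSTA][LIVMF][DE]R[LIVMF].[LIVMSTAG][LIVMFY]"
)

def has_motif(sequence: str) -> bool:
    """True if the sequence contains at least one match to the motif."""
    return CLASS_II_MOTIF.search(sequence) is not None

# Made-up peptides for illustration only.
print(has_motif("GAGLERLALL"))  # True: second residue A is allowed
print(has_motif("GDGLERLALL"))  # False: second residue D is in the excluded set
```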

[ 1] Schimmel P. Annu. Rev. Biochem. 56:125-158(1987).
[ 2] Delarue M., Moras D. BioEssays 15:675-687(1993). 
[ 3] Schimmel P. Trends Biochem. Sci. 16:1-3(1991). 

[ 4] Nagel G.M., Doolittle R.F. Proc. Natl. Acad. Sci. U.S.A. 88:8121-8125(1991). 
[ 5] Cusack S., Haertlein M., Leberman R. Nucleic Acids Res. 19:3489-3498(1991). 
[ 6] Cusack S. Biochimie 75:1077-1081(1993).
[ 7] Cusack S., Berthet-Colominas C., Haertlein M., Nassar N., Leberman R. Nature
347:249-255(1990).

[ 8] Leveque F., Plateau P., Dessen P., Blanquet S. Nucleic Acids Res. 18:305-312(1990). 

35. (ArfGap) Putative GTP-ase activating protein for Arf. Putative zinc fingers with GTPase
activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates
GTP hydrolysis for ARD1 but not ARFs. Number of members: 34

[1] Medline: 96324970. Identification and cloning of centaurin-alpha. A novel
phosphatidylinositol 3,4,5-trisphosphate-binding protein from rat brain. Hammonds-Odie LP, 
Jackson TR, Profit AA, Blader IJ, Turck CW, Prestwich GD, Theibert AB; J Biol Chem 
1996;271:18859-18868. 

[2] Medline: 97296423. A target of phosphatidylinositol 3,4,5-trisphosphate with a zinc finger
motif similar to that of the ADP-ribosylation-factor GTPase-activating protein and two
pleckstrin homology domains. Tanaka K, Imajoh-Ohmi S, Sawada T, Shirai R, Hashimoto Y, 
Iwasaki S, Kaibuchi K, Kanaho Y, Shirai T, Terada Y, Kimura K, Nagata S, Fukui Y; Eur J 
Biochem 1997;245:512-519. 

[3] Medline: 98112795. Molecular characterization of the GTPase-activating domain of
ADP-ribosylation factor domain protein 1 (ARD1). Vitale N, Moss J, Vaughan M; J Biol Chem
1998;273:2553-2560. 

36. Apolipoprotein. Apolipoprotein A1/A4/E family. This family includes: Swiss:P02647
Apolipoprotein A-I. Swiss:P06727 Apolipoprotein A-IV. Swiss:P02649 Apolipoprotein E. 
These proteins contain several 22 residue repeats which form a pair of alpha helices. Number 
of members: 42 

[1] Medline: 91289138. Three-dimensional structure of the LDL receptor-binding domain of
human apolipoprotein E. Wilson C, Wardell MR, Weisgraber KH, Mahley RW, Agard DA; 
Science 1991;252:1817-1822. 

37. Amino acid permeases signature 

Amino acid permeases are integral membrane proteins involved in the transport of amino 
acids into the cell. A number of such proteins have been found to be evolutionarily related
[1,2,3]. These proteins are: - Yeast general amino acid permeases (genes GAP1, AGP2 and
AGP3). - Yeast basic amino acid permease (gene ALP1). - Yeast Leu/Val/Ile permease (gene
BAP2). - Yeast arginine permease (gene CAN1). - Yeast dicarboxylic amino acid permease
(gene DIP5). - Yeast asparagine/glutamine permease (gene AGP1). - Yeast glutamine
permease (gene GNP1). - Yeast histidine permease (gene HIP1). - Yeast lysine permease
(gene LYP1). - Yeast proline permease (gene PUT4). - Yeast valine and tyrosine permease
(gene VAL1/TAT1). - Yeast tryptophan permease (gene TAT2/SCM2). - Yeast choline
transport protein (gene HNM1/CTR1). - Yeast GABA permease (gene UGA4). - Yeast
hypothetical protein YKL174c. - Fission yeast protein isp5. - Fission yeast hypothetical
protein SpAC8A4.11. - Fission yeast hypothetical protein SpAC11D3.08c. - Emericella
nidulans proline transport protein (gene prnB). - Trichoderma harzianum amino acid
permease INDA1. - Salmonella typhimurium L-asparagine permease (gene ansP). -
Escherichia coli aromatic amino acid transport protein (gene aroP). - Escherichia coli
D-serine/D-alanine/glycine transporter (gene cycA). - Escherichia coli GABA permease (gene
gabP). - Escherichia coli lysine-specific permease (gene lysP). - Escherichia coli
phenylalanine-specific permease (gene pheP). - Salmonella typhimurium proline-specific
permease (gene proY). - Escherichia coli and Klebsiella pneumoniae hypothetical protein
yeeF. - Escherichia coli and Salmonella typhimurium hypothetical protein yifK. - Bacillus
subtilis permeases rocC and rocE, which probably transport arginine or ornithine. These
proteins seem to contain up to 12 transmembrane segments. As a signature for this family of
proteins, the best conserved region, which is located in the second transmembrane segment,
has been selected.

Consensus pattern: [STAGC]-G-[PAG]-x(2,3)-[LIVMFYWA](2)-x-[LIVMFYW]-x- 
[LIVMFWSTAGC](2)-[STAGC]-x(3)-[LIVMFYWT]-x-[LIVMST]-x(3)-[LIVMCTA]-
[GA]-E-x(5)-[PSAL]- 

[ 1] Weber E., Chevalier M.R., Jund R. J. Mol. Evol. 27:341-350(1988). 

[ 2] Vandenbol M., Jauniaux J.-C., Grenson M. Gene 83:153-159(1989).

[ 3] Reizer J., Finley K., Kakuda D., McLeod C.L., Reizer A., Saier M.H. Jr. Protein Sci.
2:20-30(1993).

38. aakinase (1) Glutamate 5-kinase signature 

Glutamate 5-kinase (EC 2.7.2.11) (gamma-glutamyl kinase) (GK) is the enzyme that
catalyzes the first step in the biosynthesis of proline from glutamate, the ATP-dependent
phosphorylation of L-glutamate into L-glutamate 5-phosphate. In eubacteria (gene proB) and
yeast [1] (gene PRO1), GK is a monofunctional protein, while in plants and mammals, it is a
bifunctional enzyme (P5CS) [2] that consists of two domains: an N-terminal GK domain and a
C-terminal gamma-glutamyl phosphate reductase domain (EC 1.2.1.41) (see
<PDOC00940>). As a signature pattern, a highly conserved glycine- and alanine-rich region
located in the central section of these enzymes has been selected. Yeast hypothetical protein
YHR033W is highly similar to GK.

Consensus pattern: [GSTN]-x(2)-G-x-G-[GC]-[IM]-x-[STA]-K-[LIVM]-x-[SA]-[TCA]- 
x(2)-[GALV]-x(3)-G- 

[ 1] Li W., Brandriss M.C. J. Bacteriol. 174:4148-4156(1992).

[ 2] Hu C.-A.A., Delauney A.J., Verma D.P.S. Proc. Natl. Acad. Sci. U.S.A. 89:9354- 
9358(1992). 

aakinase (2) Aspartokinase signature

Aspartokinase (EC 2.7.2.4) (AK) [1] catalyzes the phosphorylation of aspartate. The product
of this reaction can then be used in the biosynthesis of lysine or in the pathway leading to
homoserine, which participates in the biosynthesis of threonine, isoleucine and methionine. In
Escherichia coli, there are three different isozymes which differ in their sensitivity to
repression and inhibition by Lys, Met and Thr. AK1 (gene thrA) and AK2 (gene metL) are
bifunctional enzymes which both consist of an N-terminal AK domain and a C-terminal
homoserine dehydrogenase domain. AK1 is involved in threonine biosynthesis and AK2 in
that of methionine. The third isozyme, AK3 (gene lysC), is monofunctional and involved in
lysine synthesis. In yeast, there is a single isozyme of AK (gene HOM3). As a signature
pattern for AK, a conserved region located in the N-terminal extremity has been selected.

Consensus pattern: [LIVM]-x-K-[FY]-G-G-[ST]-[SC]-[LIVM]-

[ 1] Rafalski J.A., Falco S.C. J. Biol. Chem. 263:2146-2151(1988).

aakinase (3) Gamma-glutamyl phosphate reductase signature 

Gamma-glutamyl phosphate reductase (EC 1.2.1.41) (GPR) is the enzyme that catalyzes the
second step in the biosynthesis of proline from glutamate, the NADP-dependent reduction of
L-glutamate 5-phosphate into L-glutamate 5-semialdehyde and phosphate. In eubacteria
(gene proA) and yeast [1] (gene PRO2), GPR is a monofunctional protein, while in plants and
mammals, it is a bifunctional enzyme (P5CS) [2] that consists of two domains: an N-terminal
glutamate 5-kinase domain (EC 2.7.2.11) (see <PDOC00701>) and a C-terminal GPR
domain. As a signature pattern, a conserved region that contains two histidine residues has
been selected. This region is located in the last third of GPR.

Consensus pattern: V-x(5)-A-[LIV]-x-H-I-x(2)-[HY]-[GS]-[ST]-x-H-[ST]-[DE]-x-I-

[ 1] Pearson B.M., Hernando Y., Payne J., Wolf S.S., Kalogeropoulos A., Schweizer M.
Yeast 12:1021-1031(1996). 

[ 2] Hu C.-A.A., Delauney A.J., Verma D.P.S. Proc. Natl. Acad. Sci. U.S.A. 89:9354- 
9358(1992). 

39. (abhydrolase) alpha/beta hydrolase fold. This catalytic domain is found in a very wide 
range of enzymes. 

[1] Ollis DL, Cheah E, Cygler M, Dijkstra B, Frolow F, Franken SM, Harel M, Remington
SJ, Silman I, Schrag J, Sussman JL, Verschueren KHG, Goldman A, Protein Eng
1992;5:197-211.

40. (Acid phosphat) Histidine acid phosphatases signatures 

Acid phosphatases (EC 3.1.3.2) are a heterogeneous group of proteins that hydrolyze
phosphate esters, optimally at low pH. It has been shown [1] that a number of acid
phosphatases, from both prokaryotes and eukaryotes, share two regions of sequence
similarity, each centered around a conserved histidine residue. These two histidines seem
to be involved in the enzymes' catalytic mechanism [2,3]. The first histidine is located in the
N-terminal section and forms a phosphohistidine intermediate, while the second is located in
the C-terminal section and possibly acts as proton donor. Enzymes belonging to this family
are called 'histidine acid phosphatases' and are listed below:

- Escherichia coli pH 2.5 acid phosphatase (gene appA).
- Escherichia coli glucose-1-phosphatase (EC 3.1.3.10) (gene agp).
- Yeast constitutive and repressible acid phosphatases (genes PHO3 and PHO5).
- Fission yeast acid phosphatase (gene pho1).
- Aspergillus phytases A and B (EC 3.1.3.8) (genes phyA and phyB).
- Mammalian lysosomal acid phosphatase.
- Mammalian prostatic acid phosphatase.
- Caenorhabditis elegans hypothetical proteins B0361.7, C05C10.1, C05C10.4
and F26C11.1.

Consensus pattern: [LIVM]-x(2)-[LIVMA]-x(2)-[LIVM]-x-R-H-[GN]-x-R-x-[PAS] [H is the
phosphohistidine residue]

Consensus pattern: [LIVMF]-x-[LIVMFAG]-x(2)-[STAGI]-H-D-[STANQ]-x-[LIVM]-x(2)-
[LIVMFY]-x(2)-[STA] [H is an active site residue] Sequences known to belong to this class
detected by the pattern: ALL, except for rat prostatic acid phosphatase, which seems to have
Tyr instead of the active site His.
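
The exception noted for the second pattern follows directly from how a consensus pattern works: the histidine is a fixed position, so a sequence carrying Tyr there cannot match. The sketch below illustrates this with a hand-translated Python regular expression and a made-up 17-residue peptide; the name HAP_SITE_2 and both toy sequences are hypothetical and do not represent the rat enzyme.

```python
import re

# Hand translation of:
# [LIVMF]-x-[LIVMFAG]-x(2)-[STAGI]-H-D-[STANQ]-x-[LIVM]-x(2)-[LIVMFY]-x(2)-[STA]
HAP_SITE_2 = re.compile(
    r"[LIVMF].[LIVMFAG].{2}[STAGI]HD[STANQ].[LIVM].{2}[LIVMFY].{2}[STA]"
)

# Made-up peptide that satisfies every position of the pattern.
toy_with_his = "LAVAASHDSALAALAAS"
# Same peptide with the single His replaced by Tyr.
toy_with_tyr = toy_with_his.replace("H", "Y")

print(bool(HAP_SITE_2.search(toy_with_his)))  # True
print(bool(HAP_SITE_2.search(toy_with_tyr)))  # False
```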

[ 1] van Etten R.L., Davidson R., Stevis P.E., MacArthur H., Moore D.L. J. Biol. Chem.
266:2313-2319(1991). 

[ 2] Ostanin K., Harms E.H., Stevis P.E., Kuciel R., Zhou M.-M., van Etten R.L. J. Biol. 
Chem. 267:22830-22836(1992). 

[ 3] Schneider G., Lindqvist Y., Vihko P. EMBO J. 12:2609-2615(1993). 

41. Aconitase family signatures 

Aconitase (aconitate hydratase) (EC 4.2.1.3 ) [1] is the enzyme from the tricarboxylic acid 
cycle that catalyzes the reversible isomerization of citrate and isocitrate. Cis-aconitate is 
formed as an intermediary product during the course of the reaction. In eukaryotes two
isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the
other found in the cytoplasm. Aconitase, in its active form, contains a 4Fe-4S iron-sulfur
cluster; three cysteine residues have been shown to be ligands of the 4Fe-4S cluster. It has
been shown that the aconitase family also contains the following proteins: - Iron-responsive
element binding protein (IRE-BP). IRE-BP is a cytosolic protein that binds to iron-responsive
elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin and delta
aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA.
IRE-BP also expresses aconitase activity. - 3-isopropylmalate dehydratase (EC 4.2.1.33)
(isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of
leucine. - Homoaconitase (EC 4.2.1.36) (homoaconitate hydratase), an enzyme that
participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts
cis-homoaconitate into homoisocitric acid. - Escherichia coli protein ybhJ.

Consensus pattern: [LIVM]-x(2)-[GSACIVM]-x-[LIV]-[GTIV]-[STP]-C-x(0,1)-T-N-
[GSTANI]-x(4)-[LIVMA] [C binds the iron-sulfur center]

Consensus pattern: G-x(2)-[LIVWPQ]-x(3)-[GAC]-C-[GSTAM]-[LIMPTA]-C-[LIMV]-
[GA] [The two Cs bind the iron-sulfur center]

[ 1] Gruer M.J., Artymiuk P.J., Guest J.R. Trends Biochem. Sci. 22:3-6(1997).

42. Actins signatures

Actins [1 to 4] are highly conserved contractile proteins that are present in all eukaryotic
cells. In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The
alpha actins are found in muscle tissues and are a major constituent of the contractile
apparatus. The beta and gamma actins co-exist in most cell types as components of the
cytoskeleton and as mediators of internal cell motility. In plants [5] there are many isoforms
which are probably involved in a variety of functions such as cytoplasmic streaming, cell
shape determination, tip growth, graviperception, cell wall deposition, etc. Actin exists either
in a monomeric form (G-actin) or in a polymerized form (F-actin). Each actin monomer can
bind a molecule of ATP; when polymerization occurs, the ATP is hydrolyzed. Actin is a
protein of from 374 to 379 amino acid residues. The structure of actin has been highly
conserved in the course of evolution. Recently some divergent actin-like proteins have been
identified in several species. These proteins are: - Centractin (actin-RPV) from mammals,
fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii (actin-II). Centractin
seems to be a component of a multi-subunit centrosomal complex involved in
microtubule-based vesicle motility. This subfamily is also known as ARP1. - ARP2 subfamily which
includes chicken ACTL, yeast ACT2, Drosophila 14D, C. elegans actC. - ARP3 subfamily
which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and fission yeast act2. -
ARP4 subfamily which includes yeast ACT3 and Drosophila 13E. Three signature patterns
have been developed. The first two are specific to actins and span positions 54 to 64 and 357
to 365. The last signature picks up both actins and the actin-like proteins and corresponds to
positions 106 to 118 in actins.

Consensus pattern: [FY]-[LIV]-G-[DE]-E-A-Q-x-[RKQ](2)-G- 
Consensus pattern: W-[IV]-[STA]-[RK]-x-[DE]-Y-[DNE]-[DE]- 

Consensus pattern: [LM]-[LIVM]-T-E-[GAPQ]-x-[LIVMFYWHQ]-N-[PSTAQ]-x(2)-N- 
[KR]- 
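
Because the first two signatures are specific to actins while the third also picks up the actin-like proteins, the three patterns can be combined into a simple decision rule. The sketch below is one possible reading of that rule, not a procedure stated in the specification: the regular expressions are hand translations of the three patterns above, and the function name classify, its return labels and the toy peptides are all hypothetical.

```python
import re

# Hand translations of the three consensus patterns listed above.
ACTIN_1 = re.compile(r"[FY][LIV]G[DE]EAQ.[RKQ]{2}G")
ACTIN_2 = re.compile(r"W[IV][STA][RK].[DE]Y[DNE][DE]")
ACTIN_LIKE = re.compile(r"[LM][LIVM]TE[GAPQ].[LIVMFYWHQ]N[PSTAQ].{2}N[KR]")

def classify(sequence: str) -> str:
    """Label a sequence from which of the three signatures it matches."""
    # A hit on either actin-specific signature is taken as evidence for an actin.
    if ACTIN_1.search(sequence) or ACTIN_2.search(sequence):
        return "actin"
    # Only the broader third signature matched.
    if ACTIN_LIKE.search(sequence):
        return "actin-like candidate"
    return "no signature"

# Short made-up fragments for illustration only.
print(classify("AAYVGDEAQSKRGAA"))    # actin
print(classify("AALLTEAPLNPKANRAA"))  # actin-like candidate
print(classify("AAAA"))               # no signature
```

A sequence that matches only the third signature could still be a true actin missed by the first two patterns, so the middle label is deliberately worded as a candidate.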

[ 1] Sheterline P., Clayton J., Sparrow J.C. (In) Actins, 3rd Edition, Academic Press Ltd, 
London, (1996). 

[ 2] Pollard T.D., Cooper J.A. Annu. Rev. Biochem. 55:987-1036(1986). 
[ 3] Pollard T.D. Curr. Opin. Cell Biol. 1:33-40(1990). 
[ 4] Rubenstein P.A. BioEssays 12:309-315(1990). 

[ 5] Meagher R.B., McLean B.G. Cell Motil. Cytoskeleton 16:164-166(1990).

43. Adenylate kinase signature

Adenylate kinase (EC 2.7.4.3) (AK) [1] is a small monomeric enzyme that catalyzes the
reversible transfer of MgATP to AMP (MgATP + AMP = MgADP + ADP). In mammals
there are three different isozymes: - AK1 (or myokinase), which is cytosolic. - AK2, which is
located in the outer compartment of mitochondria. - AK3 (or GTP:AMP phosphotransferase),
which is located in the mitochondrial matrix and which uses MgGTP instead of MgATP. The
sequence of AK has also been obtained from different bacterial species and from plants and
fungi. Two other enzymes have been found to be evolutionarily related to AK. These are: -
Yeast uridylate kinase (EC 2.7.4.-) (UK) (gene URA6) [2], which catalyzes the transfer of a
phosphate group from ATP to UMP to form UDP and ADP. - Slime mold UMP-CMP kinase
(EC 2.7.4.14) [3], which catalyzes the transfer of a phosphate group from ATP to either CMP
or UMP to form CDP or UDP and ADP. Several regions of AK family enzymes are well
conserved, including the ATP-binding domains. The most conserved of all regions has been
selected as a signature for this type of enzyme. This region includes an aspartic acid residue
that is part of the catalytic cleft of the enzyme and that is involved in a salt bridge. It also
includes an arginine residue whose modification leads to inactivation of the enzyme.

Consensus pattern: [LIVMFYW](3)-D-G-[FYI]-P-R-x(3)-[NQ]

[ 1] Schulz G.E. Cold Spring Harbor Symp. Quant. Biol. 52:429-439(1987). 

[ 2] Liljelund P., Sanni A., Friesen J.D., Lacroute F. Biochem. Biophys. Res. Commun.
165:464-473(1989).

[ 3] Wiesmueller L., Noegel A.A., Barzu O., Gerisch G., Schleicher M. J. Biol. Chem.
265:6339-6345(1990).

[ 4] Kath T.H., Schmid R., Schaefer G. Arch. Biochem. Biophys. 307:405-410(1993). 

44. (adh_short) Short-chain dehydrogenases/reductases family signature

The short-chain dehydrogenases/reductases family (SDR) [1] is a very large family of enzymes, most of
which are known to be NAD- or NADP-dependent oxidoreductases. As the first member of
this family to be characterized was Drosophila alcohol dehydrogenase, this family used to be
called [2,3,4] 'insect-type', or 'short-chain', alcohol dehydrogenases. Most members of this
family are proteins of about 250 to 300 amino acid residues. The proteins currently known to
belong to this family are listed below. - Alcohol dehydrogenase (EC 1.1.1.1) from insects
such as Drosophila. - Acetoin dehydrogenase (EC 1.1.1.5) from Klebsiella terrigena (gene
budC). - D-beta-hydroxybutyrate dehydrogenase (BDH) (EC 1.1.1.30) from mammals. -
Acetoacetyl-CoA reductase (EC 1.1.1.36) from various bacterial species (gene phbB or
phaB). - Glucose 1-dehydrogenase (EC 1.1.1.47) from Bacillus. - 3-beta-hydroxysteroid
dehydrogenase (EC 1.1.1.51) from Comamonas testosteroni. - 20-beta-hydroxysteroid
dehydrogenase (EC 1.1.1.53) from Streptomyces hydrogenans. - Ribitol dehydrogenase (EC
1.1.1.56) (RDH) from Klebsiella aerogenes. - Estradiol 17-beta-dehydrogenase (EC 1.1.1.62)
from human. - Gluconate 5-dehydrogenase (EC 1.1.1.69) from Gluconobacter oxydans (gene
gno). - 3-oxoacyl-[acyl-carrier protein] reductase (EC 1.1.1.100) from Escherichia coli (gene
fabG) and from plants. - Retinol dehydrogenase (EC 1.1.1.105) from mammals. -
2-deoxy-D-gluconate 3-dehydrogenase (EC 1.1.1.125) from Escherichia coli and Erwinia chrysanthemi
(gene kduD). - Sorbitol-6-phosphate 2-dehydrogenase (EC 1.1.1.140) from Escherichia coli
(gene gutD) and from Klebsiella pneumoniae (gene sorD). - 15-hydroxyprostaglandin
dehydrogenase (NAD+) (EC 1.1.1.141) from human. - Corticosteroid 11-beta-dehydrogenase
(EC 1.1.1.146) (11-DH) from mammals. - 7-alpha-hydroxysteroid dehydrogenase (EC
1.1.1.159) from Escherichia coli (gene hdhA), Eubacterium strain VPI 12708 (gene baiA) and
from Clostridium sordellii. - NADPH-dependent carbonyl reductase (EC 1.1.1.184) from
mammals. - Tropinone reductase-I (EC 1.1.1.206) and -II (EC 1.1.1.236) from plants. -
N-acylmannosamine 1-dehydrogenase (EC 1.1.1.233) from Flavobacterium strain 141-8. -
D-arabinitol 2-dehydrogenase (ribulose forming) (EC 1.1.1.250) from fungi. -
Tetrahydroxynaphthalene reductase (EC 1.1.1.252) from Magnaporthe grisea. - Pteridine
reductase 1 (EC 1.1.1.253) (gene PTR1) from Leishmania. -
2,5-dichloro-2,5-cyclohexadiene-1,4-diol dehydrogenase (EC 1.1.-.-) from Pseudomonas paucimobilis. -
Cis-1,2-dihydroxy-3,4-cyclohexadiene-1-carboxylate dehydrogenase (EC 1.3.1.-) from
Acinetobacter calcoaceticus (gene benD) and Pseudomonas putida (gene xylL). -
Biphenyl-2,3-dihydro-2,3-diol dehydrogenase (EC 1.3.1.-) (gene bphB) from various Pseudomonaceae.
- Cis-toluene dihydrodiol dehydrogenase (EC 1.3.1.-) from Pseudomonas putida (gene todD).
- Cis-benzene glycol dehydrogenase (EC 1.3.1.19) from Pseudomonas putida (gene bnzE). -
2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase (EC 1.3.1.28) from Escherichia coli (gene
entA) and Bacillus subtilis (gene dhbA). - Dihydropteridine reductase (EC 1.6.99.7)
(HDHPR) from mammals. - Lignin degradation enzyme ligD from Pseudomonas
paucimobilis. - Agropine synthesis reductase from Agrobacterium plasmids (gene mas1). -
Versicolorin reductase from Aspergillus parasiticus (gene VER1). - Putative keto-acyl
reductases from Streptomyces polyketide biosynthesis operons. - A trifunctional
hydratase-dehydrogenase-epimerase from the peroxisomal beta-oxidation system of Candida tropicalis.
This protein contains two tandemly repeated 'short-chain dehydrogenase-type' domains in its
N-terminal extremity. - Nodulation protein nodG from species of Azospirillum and
Rhizobium, which is probably involved in the modification of the nodulation Nod factor fatty
acyl chain. - Nitrogen fixation protein fixR from Bradyrhizobium japonicum. - Bacillus
subtilis protein dltE, which is involved in the biosynthesis of D-alanyl-lipoteichoic acid. -
Human follicular variant translocation protein 1 (FVT1). - Mouse adipocyte protein p27. -
Mouse protein Ke 6. - Maize sex determination protein TASSELSEED 2. - Sarcophaga
peregrina 25 Kd development specific protein. - Drosophila fat body protein P6. - A Listeria
monocytogenes hypothetical protein encoded in the internalins gene region. - Escherichia coli
hypothetical protein yciK. - Escherichia coli hypothetical protein ydfG. - Escherichia coli
hypothetical protein yjgI. - Escherichia coli hypothetical protein yjgU. - Escherichia coli
hypothetical protein yohF. - Bacillus subtilis hypothetical protein yoxD. - Bacillus subtilis
hypothetical protein ywfD. - Bacillus subtilis hypothetical protein ywfH. - Yeast hypothetical
protein YIL124w. - Yeast hypothetical protein YIR035c. - Yeast hypothetical protein
YIR036c. - Yeast hypothetical protein YKL055c. - Fission yeast hypothetical protein
SpAC23D3.11. One of the best conserved regions, which includes two perfectly conserved
residues, a tyrosine and a lysine, has been selected as a signature pattern for this family of
proteins. The tyrosine residue participates in the catalytic mechanism.

Consensus pattern: [LIVSPADNK]-x(12)-Y-[PSTAGNCV]-[STAGNQCIVM]-[STAGC]-K-
{PC}-[SAGFYR]-[LIVMSTAGD]-x(2)-[LIVMFYW]-x(3)-[LIVMFYWGAPTHQ]-
[GSACQRHM] [Y is an active site residue]
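Since the signature is built around the conserved tyrosine and lysine, a match can also be used to report where those two residues sit in a sequence. The following is a minimal sketch, assuming Python's re module and a hand translation of the consensus pattern above; the names SDR_SIGNATURE and find_sdr_active_sites, and the synthetic peptide in the usage note, are hypothetical and not part of the original disclosure.

```python
import re

# Hand translation of the consensus pattern above into Python regex syntax.
# The PROSITE element {PC} ("any residue except P or C") becomes [^PC].
# Named groups mark the conserved tyrosine and lysine.
SDR_SIGNATURE = re.compile(
    r"[LIVSPADNK].{12}(?P<tyr>Y)[PSTAGNCV][STAGNQCIVM][STAGC](?P<lys>K)"
    r"[^PC][SAGFYR][LIVMSTAGD].{2}[LIVMFYW].{3}[LIVMFYWGAPTHQ][GSACQRHM]"
)

def find_sdr_active_sites(sequence):
    """Return 1-based (tyrosine, lysine) positions for each signature match."""
    hits = []
    for match in SDR_SIGNATURE.finditer(sequence):
        hits.append((match.start("tyr") + 1, match.start("lys") + 1))
    return hits
```

For a synthetic peptide such as "MM" + "L" + "G" * 12 + "YSASKAALGGLGGGLG", the function reports the tyrosine and lysine four residues apart, as the pattern requires; replacing the residue immediately after the lysine with proline removes the match because of the {PC} exclusion.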

[ 1] Joernvall H., Persson B., Krook M., Atrian S., Gonzalez-Duarte R., Jeffery J., Ghosh D.
Biochemistry 34:6003-6013(1995).

[ 2] Villarroya A., Juan E., Egestad B., Joernvall H. Eur. J. Biochem. 180:191-197(1989).
[ 3] Persson B., Krook M., Joernvall H. Eur. J. Biochem. 200:537-543(1991).
[ 4] Neidle E.L., Hartnett C., Ornston N.L., Bairoch A., Rekik M., Harayama S. Eur. J.
Biochem. 204:113-120(1992).

45. (adh_short_C2) Short-chain dehydrogenases/reductases family signature 

The short-chain dehydrogenases/reductases family (SDR) [1] is a very large family of
enzymes, most of which are known to be NAD- or NADP-dependent oxidoreductases. As the
first member of this family to be characterized was Drosophila alcohol dehydrogenase, this
family used to be called [2,3,4] 'insect-type', or 'short-chain', alcohol dehydrogenases. Most
members of this family are proteins of about 250 to 300 amino acid residues. The proteins
currently known to belong to this family are listed below. - Alcohol dehydrogenase (EC
1.1.1.1) from insects such as Drosophila. - Acetoin dehydrogenase (EC 1.1.1.5) from
Klebsiella terrigena (gene budC). - D-beta-hydroxybutyrate dehydrogenase (BDH) (EC
1.1.1.30) from mammals. - Acetoacetyl-CoA reductase (EC 1.1.1.36) from various bacterial
species (gene phbB or phaB). - Glucose 1-dehydrogenase (EC 1.1.1.47) from Bacillus. -
3-beta-hydroxysteroid dehydrogenase (EC 1.1.1.51) from Comamonas testosteroni. -
20-beta-hydroxysteroid dehydrogenase (EC 1.1.1.53) from Streptomyces hydrogenans. - Ribitol
dehydrogenase (EC 1.1.1.56) (RDH) from Klebsiella aerogenes. - Estradiol
17-beta-dehydrogenase (EC 1.1.1.62) from human. - Gluconate 5-dehydrogenase (EC 1.1.1.69) from
Gluconobacter oxydans (gene gno). - 3-oxoacyl-[acyl-carrier protein] reductase (EC
1.1.1.100) from Escherichia coli (gene fabG) and from plants. - Retinol dehydrogenase (EC
1.1.1.105) from mammals. - 2-deoxy-D-gluconate 3-dehydrogenase (EC 1.1.1.125) from
Escherichia coli and Erwinia chrysanthemi (gene kduD). - Sorbitol-6-phosphate
2-dehydrogenase (EC 1.1.1.140) from Escherichia coli (gene gutD) and from Klebsiella
pneumoniae (gene sorD). - 15-hydroxyprostaglandin dehydrogenase (NAD+) (EC 1.1.1.141)
from human. - Corticosteroid 11-beta-dehydrogenase (EC 1.1.1.146) (11-DH) from
mammals. - 7-alpha-hydroxysteroid dehydrogenase (EC 1.1.1.159) from Escherichia coli
(gene hdhA), Eubacterium strain VPI 12708 (gene baiA) and from Clostridium sordellii. -
NADPH-dependent carbonyl reductase (EC 1.1.1.184) from mammals. - Tropinone
reductase-I (EC 1.1.1.206) and -II (EC 1.1.1.236) from plants. - N-acylmannosamine
1-dehydrogenase (EC 1.1.1.233) from Flavobacterium strain 141-8. - D-arabinitol
2-dehydrogenase (ribulose forming) (EC 1.1.1.250) from fungi. - Tetrahydroxynaphthalene
reductase (EC 1.1.1.252) from Magnaporthe grisea. - Pteridine reductase 1 (EC 1.1.1.253)
(gene PTR1) from Leishmania. - 2,5-dichloro-2,5-cyclohexadiene-1,4-diol dehydrogenase
(EC 1.1.-.-) from Pseudomonas paucimobilis. -
Cis-1,2-dihydroxy-3,4-cyclohexadiene-1-carboxylate dehydrogenase (EC 1.3.1.-) from Acinetobacter calcoaceticus (gene benD) and
Pseudomonas putida (gene xylL). - Biphenyl-2,3-dihydro-2,3-diol dehydrogenase (EC 1.3.1.-)
(gene bphB) from various Pseudomonaceae. - Cis-toluene dihydrodiol dehydrogenase (EC
1.3.1.-) from Pseudomonas putida (gene todD). - Cis-benzene glycol dehydrogenase (EC
1.3.1.19) from Pseudomonas putida (gene bnzE). - 2,3-dihydro-2,3-dihydroxybenzoate
dehydrogenase (EC 1.3.1.28) from Escherichia coli (gene entA) and Bacillus subtilis (gene
dhbA). - Dihydropteridine reductase (EC 1.6.99.7) (HDHPR) from mammals. - Lignin
degradation enzyme ligD from Pseudomonas paucimobilis. - Agropine synthesis reductase
from Agrobacterium plasmids (gene mas1). - Versicolorin reductase from Aspergillus
parasiticus (gene VER1). - Putative keto-acyl reductases from Streptomyces polyketide
biosynthesis operons. - A trifunctional hydratase-dehydrogenase-epimerase from the
peroxisomal beta-oxidation system of Candida tropicalis. This protein contains two tandemly
repeated 'short-chain dehydrogenase-type' domains in its N-terminal extremity. - Nodulation
protein nodG from species of Azospirillum and Rhizobium, which is probably involved in the
modification of the nodulation Nod factor fatty acyl chain. - Nitrogen fixation protein fixR
from Bradyrhizobium japonicum. - Bacillus subtilis protein dltE, which is involved in the
biosynthesis of D-alanyl-lipoteichoic acid. - Human follicular variant translocation protein 1
(FVT1). - Mouse adipocyte protein p27. - Mouse protein Ke 6. - Maize sex determination
protein TASSELSEED 2. - Sarcophaga peregrina 25 Kd development specific protein. -
Drosophila fat body protein P6. - A Listeria monocytogenes hypothetical protein encoded in
the internalins gene region. - Escherichia coli hypothetical protein yciK. - Escherichia coli
hypothetical protein ydfG. - Escherichia coli hypothetical protein yjgI. - Escherichia coli
hypothetical protein yjgU. - Escherichia coli hypothetical protein yohF. - Bacillus subtilis
hypothetical protein yoxD. - Bacillus subtilis hypothetical protein ywfD. - Bacillus subtilis
hypothetical protein ywfH. - Yeast hypothetical protein YIL124w. - Yeast hypothetical
protein YIR035c. - Yeast hypothetical protein YIR036c. - Yeast hypothetical protein
YKL055c. - Fission yeast hypothetical protein SpAC23D3.11. One of the best conserved
regions, which includes two perfectly conserved residues, a tyrosine and a lysine, has been
used as a signature pattern for this family of proteins. The tyrosine residue participates in the
catalytic mechanism.

Consensus pattern: [LIVSPADNK]-x(12)-Y-[PSTAGNCV]-[STAGNQCIVM]-[STAGC]-K-
{PC}-[SAGFYR]-[LIVMSTAGD]-x(2)-[LIVMFYW]-x(3)-[LIVMFYWGAPTHQ]-
[GSACQRHM] [Y is an active site residue]

[ 1] Joernvall H., Persson B., Krook M., Atrian S., Gonzalez-Duarte R., Jeffery J., Ghosh D. 
Biochemistry 34:6003-6013(1995). 

[ 2] Villarroya A., Juan E., Egestad B., Joernvall H. Eur. J. Biochem. 180:191-197(1989). 
[ 3] Persson B., Krook M., Joernvall H. Eur. J. Biochem. 200:537-543(1991). 
[ 4] Neidle E.L., Hartnett C., Ornston N.L., Bairoch A., Rekik M., Harayama S. Eur. J. 
Biochem. 204:113-120(1992). 

46. (adh_zinc) Zinc-containing alcohol dehydrogenases signatures 

Alcohol dehydrogenase (EC 1.1.1.1) (ADH) catalyzes the reversible oxidation of ethanol to
acetaldehyde with the concomitant reduction of NAD [1]. Currently three structurally and
catalytically different types of alcohol dehydrogenases are known: - Zinc-containing
'long-chain' alcohol dehydrogenases. - Insect-type, or 'short-chain', alcohol dehydrogenases. -
Iron-containing alcohol dehydrogenases. Zinc-containing ADH's [2,3] are dimeric or tetrameric
enzymes that bind two atoms of zinc per subunit. One of the zinc atoms is essential for
catalytic activity while the other is not. Both zinc atoms are coordinated by either cysteine or
histidine residues; the catalytic zinc is coordinated by two cysteines and one histidine.
Zinc-containing ADH's are found in bacteria, mammals, plants, and in fungi. In most species there
is more than one isozyme (for example, humans have at least six isozymes, yeast has three,
etc.). A number of other zinc-dependent dehydrogenases are closely related to zinc ADH [4],
these are: - Xylitol dehydrogenase (EC 1.1.1.9) (D-xylulose reductase). - Sorbitol
dehydrogenase (EC 1.1.1.14). - Aryl-alcohol dehydrogenase (EC 1.1.1.90) (benzyl alcohol
dehydrogenase). - Threonine 3-dehydrogenase (EC 1.1.1.103). - Cinnamyl-alcohol
dehydrogenase (EC 1.1.1.195) (CAD) [5]. CAD is a plant enzyme involved in the
biosynthesis of lignin. - Galactitol-1-phosphate dehydrogenase (EC 1.1.1.251). -
Pseudomonas putida 5-exo-alcohol dehydrogenase (EC 1.1.1.-) [6]. - Escherichia coli
starvation sensing protein rspB. - Escherichia coli hypothetical protein yjgB. - Escherichia
coli hypothetical protein yjgV. - Escherichia coli hypothetical protein yjjN. - Yeast
hypothetical protein YAL060w (FUN49). - Yeast hypothetical protein YAL061w (FUN50). -
Yeast hypothetical protein YCR105w. The pattern that has been developed to detect this class
of enzymes is based on a conserved region that includes a histidine residue which is the
second ligand of the catalytic zinc atom. This family also includes NADP-dependent quinone
oxidoreductase (EC 1.6.5.5), an enzyme found in bacteria (gene qor), in yeast and in
mammals where, in some species such as rodents, it has been recruited as an eye lens protein
and is known as zeta-crystallin [7]. The sequence of quinone oxidoreductase is distantly
related to that of other zinc-containing alcohol dehydrogenases and it lacks the zinc-ligand
residues. The torpedo fish and mammalian synaptic vesicle membrane protein vat-1 is related
to qor. A specific pattern has been developed for this subfamily.

Consensus pattern: G-H-E-x(2)-G-x(5)-[GA]-x(2)-[IVSAC] [H is a zinc ligand] 

Consensus pattern: [GSD]-[DEQH]-x(2)-L-x(3)-[SA](2)-G-G-x-G-x(4)-Q-x(2)-[KR]
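Because two separate patterns are given, a sequence can be tested against both to see which group it resembles. The sketch below is a minimal illustration, assuming Python's re module and assuming that the first pattern (with the zinc-ligand histidine) marks the zinc-containing enzymes while the second is the one developed for the quinone oxidoreductase subfamily. The dictionary labels, the function name classify and the test peptides are hypothetical.

```python
import re

# Hand translations of the two consensus patterns above.
# The labels are illustrative only and do not come from the original text.
SIGNATURES = {
    "zinc-containing ADH": re.compile(r"GHE.{2}G.{5}[GA].{2}[IVSAC]"),
    "quinone oxidoreductase subfamily": re.compile(
        r"[GSD][DEQH].{2}L.{3}[SA]{2}GG.G.{4}Q.{2}[KR]"
    ),
}

def classify(sequence):
    """Return the labels of all signatures found anywhere in the sequence."""
    return [label for label, regex in SIGNATURES.items() if regex.search(sequence)]

# Synthetic peptides built to satisfy one pattern each (not real proteins).
zinc_like = "MK" + "GHEAAGAAAAAGAAI"
qor_like = "MK" + "GDTTLTTTSAGGTGTTTTQTTK"
```

A sequence that matches neither pattern simply yields an empty list, which in practice would mean only that these two signatures are absent, not that the protein is unrelated to the family.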

[ 1] Branden C.-I., Joernvall H., Eklund H., Furugren B. (In) The Enzymes (3rd edition)
11:104-190(1975).
[ 2] Joernvall H., Persson B., Jeffery J. Eur. J. Biochem. 167:195-201(1987).
[ 3] Sun H.-W., Plapp B.V. J. Mol. Evol. 34:522-535(1992).

[ 4] Persson B., Hallborn J., Walfridsson M., Hahn-Haegerdal B., Keraenen S., Penttilae M.,
Joernvall H. FEBS Lett. 324:9-14(1993).
[ 5] Knight M.E., Halpin C., Schuch W. Plant Mol. Biol. 19:793-801(1992).

[ 6] Koga H., Aramaki H., Yamaguchi E., Takeuchi K., Horiuchi T., Gunsalus I.C. J.
Bacteriol. 166:1089-1095(1986).

[ 7] Joernvall H., Persson B., Du Bois G., Lavers G.C., Chen J.H., Gonzalez P., Rao P.V.,
Zigler J.S. Jr. FEBS Lett. 322:240-244(1993).

47. (aldedh) Aldehyde dehydrogenases active sites 

Aldehyde dehydrogenases (EC 1.2.1.3 and EC 1.2.1.5) are enzymes which oxidize a wide
variety of aliphatic and aromatic aldehydes. In mammals at least four different forms of the
enzyme are known [1]: class-1 (or Ald C), a tetrameric cytosolic enzyme; class-2 (or Ald M), a
tetrameric mitochondrial enzyme; class-3 (or Ald D), a dimeric cytosolic enzyme; and class
IV, a microsomal enzyme. Aldehyde dehydrogenases have also been sequenced from fungal
and bacterial species. A number of enzymes are known to be evolutionarily related to aldehyde
dehydrogenases; these enzymes are listed below. - Plant and bacterial betaine-aldehyde
dehydrogenase (EC 1.2.1.8) [2], an enzyme that catalyzes the last step in the biosynthesis of
betaine. - Plant and bacterial NADP-dependent glyceraldehyde-3-phosphate dehydrogenase
(EC 1.2.1.9). - Escherichia coli succinate-semialdehyde dehydrogenase (NADP+) (EC
1.2.1.16) (gene gabD) [3], which oxidizes succinate semialdehyde into succinate. -
Escherichia coli lactaldehyde dehydrogenase (EC 1.2.1.22) (gene ald) [4]. - Mammalian
succinate semialdehyde dehydrogenase (NAD+) (EC 1.2.1.24). - Escherichia coli
phenylacetaldehyde dehydrogenase (EC 1.2.1.39). - Escherichia coli
5-carboxymethyl-2-hydroxymuconate semialdehyde dehydrogenase (gene hpcC). - Pseudomonas putida
2-hydroxymuconic semialdehyde dehydrogenase [5] (genes dmpC and xylG), an enzyme in the
meta-cleavage pathway for the degradation of phenols, cresols and catechol. - Bacterial and
mammalian methylmalonate-semialdehyde dehydrogenase (MMSDH) (EC 1.2.1.27) [6], an
enzyme involved in the distal pathway of valine catabolism. - Yeast
delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.5.1.12) [7] (gene PUT2), which converts proline to
glutamate. - Bacterial multifunctional putA protein, which contains a
delta-1-pyrroline-5-carboxylate dehydrogenase domain. - 26G, a garden pea protein of unknown function which
is induced by dehydration of shoots [8]. - Mammalian formyltetrahydrofolate dehydrogenase
(EC 1.5.1.6) [9]. This is a cytosolic enzyme responsible for the NADP-dependent
decarboxylative reduction of 10-formyltetrahydrofolate into tetrahydrofolate. It is a protein
of about 900 amino acids which consists of three domains; the C-terminal domain (480
residues) is structurally and functionally related to aldehyde dehydrogenases. - Yeast
hypothetical protein YBR006w. - Yeast hypothetical protein YER073w. - Yeast hypothetical
protein YHR039c. - Caenorhabditis elegans hypothetical protein F01F1.6. A glutamic acid
and a cysteine residue have been implicated in the catalytic activity of mammalian aldehyde
dehydrogenase. These residues are conserved in all the enzymes of this family. Two patterns
have been derived for this family, one for each of the active site residues.

Consensus pattern: [LIVMFGA]-E-[LIMSTAC]-[GS]-G-[KNLM]-[SADN]-[TAPFV] [E is
the active site residue]

Consensus pattern: [FYLVA]-x(3)-G-[QE]-x-C-[LIVMGSTANC]-[AGCN]-x-
[GSTADNEKR] [C is the active site residue]

[ 1] Hempel J., Harper K., Lindahl R. Biochemistry 28:1160-1167(1989). 

[ 2] Weretilnyk E.A., Hanson A.D. Proc. Natl. Acad. Sci. U.S.A. 87:2745-2749(1990). 

[ 3] Niegemann E., Schulz A., Bartsch K. Arch. Microbiol. 160:454-460(1993). 

[ 4] Hidalgo E., Chen Y.-M., Lin E.C.C., Aguilar J. J. Bacteriol. 173:6118-6123(1991). 

[ 5] Nordlund I., Shingler V. Biochim. Biophys. Acta 1049:227-230(1990). 

[ 6] Steele M.I., Lorenz D., Hatter K., Park A., Sokatch J.R. J. Biol. Chem.
267:13585-13592(1992).

[ 7] Krzywicki K.A., Brandriss M.C. Mol. Cell. Biol. 4:2837-2842(1984). 
[ 8] Guerrero F.D., Jones J.T., Mullet J.E. Plant Mol. Biol. 15:11-26(1990). 
[ 9] Cook R.J., Lloyd R.S., Wagner C. J. Biol. Chem. 266:4965-4973(1991). 

48. Aldo/keto reductase family signatures 

The aldo-keto reductase family [1,2] groups together a number of structurally and
functionally related NADPH-dependent oxidoreductases as well as some other proteins. The
proteins known to belong to this family are: - Aldehyde reductase (EC 1.1.1.2). - Aldose
reductase (EC 1.1.1.21). - 3-alpha-hydroxysteroid dehydrogenase (EC 1.1.1.50), which
terminates androgen action by converting 5-alpha-dihydrotestosterone to
3-alpha-androstanediol. - Prostaglandin F synthase (EC 1.1.1.188), which catalyzes the reduction of
prostaglandins H2 and D2 to F2-alpha. - D-sorbitol-6-phosphate dehydrogenase (EC
1.1.1.200) from apple. - Morphine 6-dehydrogenase (EC 1.1.1.218) from Pseudomonas
putida plasmid pMDH7.2 (gene morA). - Chlordecone reductase (EC 1.1.1.225), which
reduces the pesticide chlordecone (kepone) to the corresponding alcohol. -
2,5-diketo-D-gluconic acid reductase (EC 1.1.1.-), which catalyzes the reduction of 2,5-diketogluconic acid
to 2-keto-L-gulonic acid, a key intermediate in the production of ascorbic acid. -
NAD(P)H-dependent xylose reductase (EC 1.1.1.-) from the yeast Pichia stipitis. This enzyme reduces
xylose into xylitol. - Trans-1,2-dihydrobenzene-1,2-diol dehydrogenase (EC 1.3.1.20). -
3-oxo-5-beta-steroid 4-dehydrogenase (EC 1.3.99.6), which catalyzes the reduction of
delta(4)-3-oxosteroids. - A soybean reductase, which co-acts with chalcone synthase in the formation of
4,2',4'-trihydroxychalcone. - Frog eye lens rho crystallin. - Yeast GCY protein, whose
function is not known. - Leishmania major P110/11E protein. P110/11E is a developmentally
regulated protein whose abundance is markedly elevated in promastigotes compared with
amastigotes. Its exact function is not yet known. - Escherichia coli hypothetical protein yafB.
- Escherichia coli hypothetical protein yghE. - Yeast hypothetical protein YBR149w. - Yeast
hypothetical protein YHR104w. - Yeast hypothetical protein YJR096w. These proteins all
have about 300 amino acid residues. Three consensus patterns have been developed that are
specific to this family of proteins. The first one is located in the N-terminal section of these
proteins. The second pattern is located in the central section. The third pattern, located in the
C-terminal section, is centered on a lysine residue whose chemical modification, in aldose and
aldehyde reductases, affects the catalytic efficiency.

Consensus pattern: G-[FY]-R-[HSAL]-[LIVMF]-D-[STAGC]-[AS]-x(5)-E-x(2)-[LIVM]-G
Consensus pattern: [LIVMFY]-x(9)-[KREQ]-x-[LIVM]-G-[LIVM]-[SC]-N-[FY]
Consensus pattern: [LIVM]-[PAIV]-[KR]-[ST]-x(4)-R-x(2)-[GSTAEQK]-[NSL]-x(2)-
[LIVMFA] [K is a putative active site residue]

[ 1] Bohren K.M., Bullock B., Wermuth B., Gabbay K.H. J. Biol. Chem. 264:9547- 
9551(1989). 

[ 2] Bruce N.C., Willey D.L., Coulson A.F.W., Jeffery J. Biochem. J. 299:805-811(1994).

49. Alpha amylase. This family is classified as family 13 of the glycosyl hydrolases. The
structure is an 8-stranded alpha/beta barrel, interrupted by a ~70 a.a. calcium-binding domain
protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key
beta-barrel domain.

[1] Larson SB, Greenwood A, Cascio D, Day J, McPherson A, J Mol Biol 1994;235:1560- 
1584. 

50. Aminotransferases class-I pyridoxal-phosphate attachment site

Aminotransferases share certain mechanistic features with other pyridoxal-phosphate
dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a
lysine residue. On the basis of sequence similarity, these various enzymes can be grouped
[1,2] into subfamilies. One of these, called class-I, currently consists of the following
enzymes: - Aspartate aminotransferase (AAT) (EC 2.6.1.1). AAT catalyzes the reversible
transfer of the amino group from L-aspartate to 2-oxoglutarate to form oxaloacetate and
L-glutamate. In eukaryotes, there are two AAT isozymes: one is located in the mitochondrial
matrix, the second is cytoplasmic. In prokaryotes, only one form of AAT is found (gene
aspC). - Tyrosine aminotransferase (EC 2.6.1.5), which catalyzes the first step in tyrosine
catabolism by reversibly transferring its amino group to 2-oxoglutarate to form
4-hydroxyphenylpyruvate and L-glutamate. - Aromatic aminotransferase (EC 2.6.1.57)
involved in the synthesis of Phe, Tyr, Asp and Leu (gene tyrB). -
1-aminocyclopropane-1-carboxylate synthase (EC 4.4.1.14) (ACC synthase) from plants. ACC synthase catalyzes the
first step in ethylene biosynthesis. - Pseudomonas denitrificans cobC, which is involved in
cobalamin biosynthesis. - Yeast hypothetical protein YJL060w. The sequence around the
pyridoxal-phosphate attachment site of this class of enzyme is sufficiently conserved to allow
the creation of a specific pattern.



Consensus pattern: [GS]-[LIVMFYTAC]-[GSTA]-K-x(2)-[GSALVN]-[LIVMFA]-x-
[GNAR]-x-R-[LIVMA]-[GA] [K is the pyridoxal-P attachment site]

[ 1] Bairoch A. Unpublished observations (1992).

[ 2] Sung M.H., Tanizawa K., Tanaka H., Kuramitsu S., Kagamiyama H., Hirotsu K.,
Okamoto A., Higuchi T., Soda K. J. Biol. Chem. 266:2567-2572(1991).

51. Aminotransferases class-II pyridoxal-phosphate attachment site

Aminotransferases share certain mechanistic features with other pyridoxal-phosphate
dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a
lysine residue. On the basis of sequence similarity, these various enzymes can be grouped [1]
into subfamilies. One of these, called class-II, currently consists of the following enzymes: -
Glycine acetyltransferase (EC 2.3.1.29), which catalyzes the addition of acetyl-CoA to
glycine to form 2-amino-3-oxobutanoate (gene kbl). - 5-aminolevulinic acid synthase (EC
2.3.1.37) (delta-ALA synthase), which catalyzes the first step in heme biosynthesis via the
Shemin (or C4) pathway, i.e. the addition of succinyl-CoA to glycine to form
5-aminolevulinate. - 8-amino-7-oxononanoate synthase (EC 2.3.1.47) (7-KAP synthetase), a
bacterial enzyme (gene bioF) which catalyzes an intermediate step in the biosynthesis of
biotin: the addition of 6-carboxy-hexanoyl-CoA to alanine to form 8-amino-7-oxononanoate.
- Histidinol-phosphate aminotransferase (EC 2.6.1.9), which catalyzes the eighth step in the
histidine biosynthetic pathway: the transfer of an amino group from
3-(imidazol-4-yl)-2-oxopropyl phosphate to glutamic acid to form histidinol phosphate and 2-oxoglutarate. -
Serine palmitoyltransferase (EC 2.3.1.50) from yeast (genes LCB1 and LCB2), which
catalyzes the condensation of palmitoyl-CoA and serine to form 3-ketosphinganine. The
sequence around the pyridoxal-phosphate attachment site of this class of enzyme is
sufficiently conserved to allow the creation of a specific pattern.

Consensus pattern: T-[LIVMFYW]-[STAG]-K-[SAG]-[LIVMFYWR]-[SAG]-x(2)-[SAG]
[K is the pyridoxal-P attachment site]

[ 1] Bairoch A. Unpublished observations (1991). 

52. Aminotransferases class-III pyridoxal-phosphate attachment site




Aminotransferases share certain mechanistic features with other pyridoxal-phosphate
dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a
lysine residue. On the basis of sequence similarity, these various enzymes can be grouped
[1,2] into subfamilies. One of these, called class-III, currently consists of the following
enzymes: - Acetylornithine aminotransferase (EC 2.6.1.11), which catalyzes the transfer of an
amino group from acetylornithine to alpha-ketoglutarate, yielding
N-acetyl-glutamic-5-semialdehyde and glutamic acid. - Ornithine aminotransferase (EC 2.6.1.13), which catalyzes the
transfer of an amino group from ornithine to alpha-ketoglutarate, yielding
glutamic-5-semialdehyde and glutamic acid. - Omega-amino acid--pyruvate aminotransferase (EC 2.6.1.18),
which catalyzes transamination between a variety of omega-amino acids, mono- and
diamines, and pyruvate. It plays a pivotal role in omega amino acids metabolism. -
4-aminobutyrate aminotransferase (EC 2.6.1.19) (GABA transaminase), which catalyzes the
transfer of an amino group from GABA to alpha-ketoglutarate, yielding succinate
semialdehyde and glutamic acid. - DAPA aminotransferase (EC 2.6.1.62), a bacterial enzyme
(gene bioA) which catalyzes an intermediate step in the biosynthesis of biotin, the
transamination of 7-keto-8-aminopelargonic acid (7-KAP) to form 7,8-diaminopelargonic
acid (DAPA). - 2,2-dialkylglycine decarboxylase (EC 4.1.1.64), a Pseudomonas cepacia
enzyme (gene dgdA) that catalyzes the decarboxylating amino transfer of 2,2-dialkylglycine
and pyruvate to dialkyl ketone, alanine and carbon dioxide. - Glutamate-1-semialdehyde
aminotransferase (EC 5.4.3.8) (GSA). GSA is the enzyme involved in the second step of
porphyrin biosynthesis, via the C5 pathway. It transfers the amino group on carbon 2 of
glutamate-1-semialdehyde to the neighbouring carbon, to give delta-aminolevulinic acid. -
Bacillus subtilis aminotransferase yhxA. - Bacillus subtilis aminotransferase yodT. -
Haemophilus influenzae aminotransferase HI0949. - Caenorhabditis elegans aminotransferase
T01B11.2. The sequence around the pyridoxal-phosphate attachment site of this class
of enzyme is sufficiently conserved to allow the creation of a specific pattern.

Consensus pattern: [LIVMFYWC](2)-x-D-E-[IVA]-x(2)-G-[LIVMFAGC]-x(0,1)-
[RSACLI]-x-[GSAD]-x(12,16)-D-[LIVMFC]-[LIVMFYSTA]-x(2)-[GSA]-K-x(3)-
[GSTADNV]-[GSAC] [K is the pyridoxal-P attachment site]
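Unlike most patterns in this section, this one contains two gaps of variable length: an optional residue, x(0,1), and a spacer of 12 to 16 residues, x(12,16). In regular-expression terms these become bounded repeats. The sketch below is a minimal illustration, assuming Python's re module and a hand translation of the pattern above; the names CLASS_III, plp_lysine_position and make_peptide, and the synthetic peptides they produce, are hypothetical.

```python
import re

# Hand translation of the class-III pattern above.
# x(0,1) becomes .{0,1} and x(12,16) becomes .{12,16}.
# The named group marks the lysine that carries the pyridoxal phosphate.
CLASS_III = re.compile(
    r"[LIVMFYWC]{2}.DE[IVA].{2}G[LIVMFAGC].{0,1}[RSACLI].[GSAD]"
    r".{12,16}D[LIVMFC][LIVMFYSTA].{2}[GSA](?P<plp_lys>K).{3}[GSTADNV][GSAC]"
)

def plp_lysine_position(sequence):
    """Return the 1-based position of the attachment-site lysine, or None."""
    match = CLASS_III.search(sequence)
    return None if match is None else match.start("plp_lys") + 1

def make_peptide(optional_residue, spacer_length):
    """Build a synthetic peptide (not a real protein) with a chosen spacer length."""
    return ("LLTDEITTGL" + optional_residue + "RTG"
            + "T" * spacer_length + "DLLTTGKTTTGG")
```

With these helpers, make_peptide("", 12) and make_peptide("A", 16) both match, with the lysine reported at different positions, whereas a spacer of 17 residues falls outside the allowed range and gives no match.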

[ 1] Bairoch A. Unpublished observations (1992).
[ 2] Yonaha K., Nishie M., Aibara S. J. Biol. Chem. 267:12506-12510(1992).

53. Ank repeat. There is no clear separation between noise and signal on the HMM search.
Ankyrin repeats generally consist of a beta, alpha, alpha, beta order of secondary structures.
The repeats associate to form a higher order structure.

[1] A, Holak TA, FEBS Lett 1997;401:127-132.

[2] Lux SE, John KM, Bennett V, Nature 1990;345:736-739. 



54. Aminotransferases class-IV signature

Aminotransferases share certain mechanistic features with other pyridoxal-phosphate 
dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a 
lysine residue. On the basis of sequence similarity, these various enzymes can be grouped 
[1,2] into subfamilies. One of these, called class-IV, currently consists of the following 
enzymes: 

- Branched-chain amino-acid aminotransferase (EC 2.6.1.42) (transaminase B), a
bacterial (gene ilvE) and eukaryotic enzyme which catalyzes the reversible
transfer of an amino group from glutamate to 4-methyl-2-oxopentanoate, to form
leucine and 2-oxoglutarate.

- D-alanine aminotransferase (EC 2.6.1.21). A bacterial enzyme which catalyzes the
transfer of the amino group from D-alanine (and other D-amino acids) to
2-oxoglutarate, to form pyruvate and D-glutamate.

- 4-amino-4-deoxychorismate (ADC) lyase (gene pabC). A bacterial enzyme that 
converts ADC into 4-aminobenzoate (PABA) and pyruvate. 

The above enzymes are proteins of about 270 to 415 amino-acid residues that share a
few regions of sequence similarity. Surprisingly, the best-conserved region does not include
the lysine residue to which the pyridoxal-phosphate group is known to be attached in ilvE.
The region that has been selected as a signature pattern is located some 40 residues at the
C-terminus side of the PLP-lysine.



Consensus pattern: E-x-[STAGCI]-x(2)-N-[LIVMFAC]-[FY]-x(6,12)-[LIVMF]-x-T-x(6,8)-
[LIVM]-x-[GS]-[LIVM]-x-[KR]
[1] Green J.M., Merkel W.K., Nichols B.P. J. Bacteriol. 174:5317-5323(1992).
[2] Bairoch A. Unpublished observations (1992). 

55. Aminotransferases class-V pyridoxal-phosphate attachment site

Aminotransferases share certain mechanistic features with other pyridoxal-phosphate
dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a
lysine residue. On the basis of sequence similarity, these various enzymes can be grouped 
[1,2] into subfamilies. One of these, called class-V, currently consists of the following 
enzymes: - Phosphoserine aminotransferase (EC 2.6.1.52 ), an enzyme which catalyzes the 
reversible interconversion of phosphoserine and 2-oxoglutarate to 3-phosphonooxypyruvate 
and glutamate. It is required both in the major phosphorylated pathway of serine biosynthesis 
and in pyridoxine biosynthesis. The bacterial enzyme (gene serC) is highly similar to a rabbit 
endometrial progesterone-induced protein (EPIP), which is probably a phosphoserine 
aminotransferase [3]. - Serine-glyoxylate aminotransferase (EC 2.6.1.45) (SGAT) (gene
sgaA) from Methylobacterium extorquens. - Serine-pyruvate aminotransferase (EC
2.6.1.51). This enzyme also acts as an alanine-glyoxylate aminotransferase (EC 2.6.1.44). In
vertebrates, it is located in the peroxisomes and/or mitochondria. - Isopenicillin N epimerase 
(gene cefD). This enzyme is involved in the biosynthesis of cephalosporin antibiotics and 
catalyzes the reversible isomerization of isopenicillin N and penicillin N. - NifS, a protein of 
the nitrogen fixation operon of some bacteria and cyanobacteria. The exact function of nifS is 
not yet known. A highly similar protein has been found in fungi (gene NFS1 or SPL1). - The
small subunit of cyanobacterial soluble hydrogenase (EC 1.12.-.-). - Hypothetical protein 
ycbU from Bacillus subtilis. - Hypothetical protein YFL030w from yeast. The sequence 
around the pyridoxal-phosphate attachment site of this class of enzyme is sufficiently 
conserved to allow the creation of a specific pattern. 

Consensus pattern: [LIVFYCHT]-[DGH]-[LIVMFYAC]-[LIVMFYA]-x(2)-[GSTAC]-
[GSTA]-[HQR]-K-x(4,6)-G-x-[GSAT]-x-[LIVMFYSAC] [K is the pyridoxal-P attachment
site]

[ 1] Ouzounis C., Sander C. FEBS Lett. 322:159-164(1993).
[ 2] Bairoch A. Unpublished observations (1992). 
[ 3] van der Zel A., Lam H.-M., Winkler M.E. Nucleic Acids Res. 17:8379-8379(1989).

56. Annexins repeated domain signature

Annexins [1 to 6] are a group of calcium-binding proteins that associate reversibly with 
membranes. They bind to phospholipid bilayers in the presence of micromolar free calcium 
concentration. The binding is specific for calcium and for acidic phospholipids. Annexins 
have been claimed to be involved in cytoskeletal interactions, phospholipase inhibition, 
intracellular signalling, anticoagulation, and membrane fusion. Each of these proteins consists
of an N-terminal domain of variable length followed by four or eight copies of a conserved
segment of sixty-one residues. The repeat (sometimes known as an 'endonexin fold') consists
of five alpha-helices that are wound into a right-handed superhelix [7]. The proteins known to
belong to the annexin family are listed below: - Annexin I (Lipocortin 1) (Calpactin 2) (p35)
(Chromobindin 9). - Annexin II (Lipocortin 2) (Calpactin 1) (Protein I) (p36) (Chromobindin 
8). - Annexin III (Lipocortin 3) (PAP-III). - Annexin IV (Lipocortin 4) (Endonexin I) (Protein 
II) (Chromobindin 4). - Annexin V (Lipocortin 5) (Endonexin 2) (VAC-alpha) (Anchorin 
CII) (PAP-I). - Annexin VI (Lipocortin 6) (Protein III) (Chromobindin 20) (p68) (p70). This 
is the only known annexin that contains 8 (instead of 4) repeats. - Annexin VII (Synexin). - 
Annexin VIII (Vascular anticoagulant-beta) (VAC-beta). - Annexin IX from Drosophila. - 
Annexin X from Drosophila. - Annexin XI (Calcyclin-associated annexin) (CAP-50). - 
Annexin XII from Hydra vulgaris. - Annexin XIII (Intestine-specific annexin) (ISA). The
signature pattern for this domain spans positions 9 to 61 of the repeat and includes the only
perfectly conserved residue (an arginine in position 22).

Consensus pattern: [TG]-[STV]-x(8)-[LIVMF]-x(2)-R-x(3)-[DEQNH]-x(7)-[IFY]-x(7)-
[LIVMF]-x(3)-[LIVMF]-x(11)-[LIVMFA]-x(2)-[LIVMF]

[ 1] Raynal P., Pollard H.B. Biochim. Biophys. Acta 1197:63-93(1994). 

[ 2] Barton G.J., Newman R.H., Freemont P.S., Crumpton M.J. Eur. J. Biochem.
198:749-760(1991).

[ 3] Burgoyne R.D., Geisow M.J. Cell Calcium 10:1-10(1989). 

[ 4] Haigler H.T., Fitch J.M., Jones J.M., Schlaepfer D.D. Trends Biochem. Sci. 14:48- 
50(1989). 
[ 5] Klee C.B. Biochemistry 27:6645-6653(1988).

[ 6] Smith P.D., Moss S.E. Trends Genet. 10:241-246(1994). 

[ 7] Huber R., Roemisch J., Paques E.-P. EMBO J. 9:3867-3874(1990). 

[ 8] Fiedler K., Simons K. Trends Biochem. Sci. 20:177-178(1995). 


57. (arf_1) ADP-ribosylation factors family signature

ADP-ribosylation factors (ARF) [1,2,3,4] are 20 Kd GTP-binding proteins involved in
protein trafficking. They may modulate vesicle budding and uncoating within the Golgi
apparatus. ARFs also act as allosteric activators of cholera toxin ADP-ribosyltransferase
activity. They are evolutionarily conserved and present in all eukaryotes. At least six forms of
ARF are present in mammals and three in budding yeast. The ARF family also includes
proteins highly related to ARFs but which lack the cholera toxin cofactor activity; they are
collectively known as ARLs (ARF-like). ARD1 is a 64 Kd mammalian protein of unknown
biological function that contains an ARF domain at its C-terminal extremity. Proteins from
the ARF family are generally included in the RAS 'superfamily' of small GTP-binding
proteins [5], but they are only slightly related to the other RAS proteins. They also differ
from RAS proteins in that they lack cysteine residues at their C-termini and are therefore not
subject to prenylation. The ARFs are N-terminally myristoylated (the ARLs have not yet
been shown to be modified in such a fashion). A conserved region in the C-terminal part of
ARFs and ARLs has been selected as a signature pattern.

Consensus pattern: [HRQT]-x-[FYWI]-x-[LIVM]-x(4)-A-x(2)-G-x(2)-[LIVM]-x(2)-[GSA]-
[LIVMF]-x-[WK]-[LIVM]

Note: proteins belonging to this family also contain a copy of the ATP/GTP-binding motif
'A' (P-loop) (see <PDOC00017>).

[ 1] Boman A.L., Kahn R.A. Trends Biochem. Sci. 20:147-150(1995). 
[ 2] Moss J., Vaughan M. Cell. Signal. 4:367-399(1993).
[ 3] Moss J., Vaughan M. Prog. Nucleic Acid Res. Mol. Biol. 45:47-65(1993).

[ 4] Amor J.C., Harrison D.H., Kahn R.A., Ringe D. Nature 372:704-708(1994). 

[ 5] Valencia A., Chardin P., Wittinghofer A., Sander C. Biochemistry 30:4637-4648(1991). 
(arf_2) ATP/GTP-binding site motif A (P-loop)

From sequence comparisons and crystallographic data analysis it has been shown 
[1,2,3,4,5,6] that an appreciable proportion of proteins that bind ATP or GTP share a number 
of more or less conserved sequence motifs. The best conserved of these motifs is a glycine- 
rich region, which typically forms a flexible loop between a beta-strand and an alpha-helix. 
This loop interacts with one of the phosphate groups of the nucleotide. This sequence motif is 
generally referred to as the 'A' consensus sequence [1] or the 'P-loop' [5]. There are numerous
ATP- or GTP-binding proteins in which the P-loop is found. A number of protein families for
which the relevance of the presence of such motif has been noted are listed below: - ATP 
synthase alpha and beta subunits (see <PDOC00137>). - Myosin heavy chains. - Kinesin 
heavy chains and kinesin-like proteins (see <PDOC00343>). - Dynamins and dynamin-like 
proteins (see <PDOC00362>). - Guanylate kinase (see <PDOC00670>). - Thymidine kinase 
(see <PDOC00524>). - Thymidylate kinase (see <PDOC01034>). - Shikimate kinase (see
<PDOC00868>). - Nitrogenase iron protein family (nifH/frxC) (see <PDOC00580>). -
ATP-binding proteins involved in 'active transport' (ABC transporters) [7] (see <PDOC00185>). -
DNA and RNA helicases [8,9,10]. - GTP-binding elongation factors (EF-Tu, EF-1alpha,
EF-G, EF-2, etc.). - Ras family of GTP-binding proteins (Ras, Rho, Rab, Ral, Ypt1, SEC4, etc.).
- Nuclear protein ran (see <PDOC00859>). - ADP-ribosylation factors family (see
<PDOC00781>). - Bacterial dnaA protein (see <PDOC00771>). - Bacterial recA protein (see
<PDOC00131>). - Bacterial recF protein (see <PDOC00539>). - Guanine nucleotide-binding
proteins alpha subunits (Gi, Gs, Gt, GO, etc.). - DNA mismatch repair proteins mutS family
(see <PDOC00388>). - Bacterial type II secretion system protein E (see <PDOC00567>). Not
all ATP- or GTP-binding proteins are picked up by this motif. A number of proteins escape
detection because the structure of their ATP-binding site is completely different from that of
the P-loop. Examples of such proteins are the E1-E2 ATPases or the glycolytic kinases. In 
other ATP- or GTP-binding proteins the flexible loop exists in a slightly different form; this 
is the case for tubulins or protein kinases. A special mention must be reserved for adenylate 
kinase, in which there is a single deviation from the P-loop pattern: in the last position Gly is 
found instead of Ser or Thr. 

Consensus pattern: [AG]-x(4)-G-K-[ST]- 
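
Because the P-loop pattern is short, a sequence can contain several hits, and hits may overlap. The following is a minimal sketch of how such a scan might look, assuming the pattern [AG]-x(4)-G-K-[ST] is written directly as a regular expression. The function name `find_p_loops` and the two test peptides are hypothetical, used only for illustration. A lookahead group is used so that overlapping hits are not skipped.

```python
import re

# [AG]-x(4)-G-K-[ST] wrapped in a lookahead so overlapping hits are reported.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_p_loops(sequence):
    """Return (1-based start position, matched residues) for every P-loop hit."""
    return [(m.start() + 1, m.group(1)) for m in P_LOOP.finditer(sequence.upper())]


# Illustrative peptide containing one glycine-rich loop.
print(find_p_loops("MTEYKLVVVGAGGVGKSALT"))

# A peptide with Gly in the last position, the deviation described above for
# adenylate kinase, is not picked up by the pattern.
print(find_p_loops("GAAAAGKG"))
```

The second call shows why some ATP- or GTP-binding proteins escape detection: a single residue outside the allowed set at any fixed position is enough to prevent a match.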

[ 1] Walker J.E., Saraste M., Runswick M.J., Gay N.J. EMBO J. 1:945-951(1982).
[ 2] Moller W., Amons R. FEBS Lett. 186:1-7(1985).

[ 3] Fry D.C., Kuby S.A., Mildvan A.S. Proc. Natl. Acad. Sci. U.S.A. 83:907-911(1986). 
[ 4] Dever T.E., Glynias M.J., Merrick W.C. Proc. Natl. Acad. Sci. U.S.A. 84:1814- 
1818(1987). 

[ 5] Saraste M., Sibbald P.R., Wittinghofer A. Trends Biochem. Sci. 15:430-434(1990).
[ 6] Koonin E.V. J. Mol. Biol. 229:1165-1174(1993). 

[ 7] Higgins C.F., Hyde S.C., Mimmack M.M., Gileadi U., Gill D.R., Gallagher M.P. J. 
Bioenerg. Biomembr. 22:571-592(1990).

[ 8] Hodgman T.C. Nature 333:22-23(1988) and Nature 333:578-578(1988) (Errata). 

[ 9] Linder P., Lasko P., Ashburner M., Leroy P., Nielsen P.J., Nishi K., Schnier J., Slonimski
P.P. Nature 337:121-122(1989).

[10] Gorbalenya A.E., Koonin E.V., Donchenko A.P., Blinov V.M. Nucleic Acids Res. 
17:4713-4730(1989). 

58. Arginase family signatures 

The following enzymes have been shown [1] to be evolutionarily related: - Arginase (EC
3.5.3.1), a ubiquitous enzyme which catalyzes the degradation of arginine to ornithine and
urea [2]. - Agmatinase (EC 3.5.3.11) (agmatine ureohydrolase), a prokaryotic enzyme (gene
speB) that catalyzes the hydrolysis of agmatine into putrescine and urea. -
Formiminoglutamase (EC 3.5.3.8) (formiminoglutamate hydrolase), a prokaryotic enzyme
(gene hutG) that hydrolyzes N-formimino-glutamate into glutamate and formamide. -
Hypothetical proteins from methanogenic archaebacteria. These enzymes are proteins of
about 300 amino-acid residues. Three conserved regions that contain charged residues which
are involved in the binding of the two manganese ions [3] can be used as signature patterns.

Consensus pattern: [LIVMF]-G-G-x-H-x-[LIVMT]-[STAV]-x-[PAG]-x(3)-[GSTA] [H binds
manganese]

Consensus pattern: [LIVM](2)-x-[LIVMFY]-D-[AS]-H-x-D [The two D's and the H bind
manganese]

Consensus pattern: [ST]-[LIVMFY]-D-[LIVM]-D-x(3)-[PAQ]-x(3)-P-[GSA]-x(7)-G [The
two D's bind manganese]
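
When a family is described by several signature patterns, as the arginase family is here, it can be useful to report which of them a candidate sequence satisfies. The sketch below is one possible way to do this, assuming the three patterns above are written directly as regular expressions. The names `ARGINASE_SIGNATURES` and `signature_report` and the three fragments are hypothetical; the fragments were built by hand to satisfy one pattern each and are not real arginase sequences.

```python
import re

# The three arginase family patterns, written as regular expressions.
ARGINASE_SIGNATURES = {
    "signature_1": re.compile(r"[LIVMF]GG.H.[LIVMT][STAV].[PAG].{3}[GSTA]"),
    "signature_2": re.compile(r"[LIVM]{2}.[LIVMFY]D[AS]H.D"),
    "signature_3": re.compile(r"[ST][LIVMFY]D[LIVM]D.{3}[PAQ].{3}P[GSA].{7}G"),
}


def signature_report(sequence):
    """Map each signature name to True if the sequence contains a hit."""
    seq = sequence.upper()
    return {name: rx.search(seq) is not None
            for name, rx in ARGINASE_SIGNATURES.items()}


# Hand-built fragments, one per pattern (illustrative only).
FRAGMENT_1 = "LGGAHALSAPAAAG"
FRAGMENT_2 = "LLALDAHAD"
FRAGMENT_3 = "SLDLDAAAPAAAPG" + "A" * 7 + "G"

print(signature_report(FRAGMENT_1 + FRAGMENT_2 + FRAGMENT_3))
print(signature_report(FRAGMENT_2))
```

Whether a sequence must match one, two, or all three signatures to be assigned to the family is a policy choice that this sketch leaves to the caller.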




[ 1] Ouzounis C., Kyrpides N.C. J. Mol. Evol. 39:101-104(1994).

[ 2] Jenkinson C.P., Grody W.W., Cederbaum S.D. Comp. Biochem. Physiol.
114B:107-132(1996).

[ 3] Kanyo Z.F., Scolnick L.R., Ash D.E., Christianson D.W. Nature 383:554-557(1996). 
59. (asp) Eukaryotic and viral aspartyl proteases active site 

Aspartyl proteases, also known as acid proteases, (EC 3.4.23.-) are a widely distributed
family of proteolytic enzymes [1,2,3] known to exist in vertebrates, fungi, plants, retroviruses
and some plant viruses. Aspartate proteases of eukaryotes are monomeric enzymes which
consist of two domains. Each domain contains an active site centered on a catalytic aspartyl
residue. The two domains most probably evolved from the duplication of an ancestral gene
encoding a primordial domain. Currently known eukaryotic aspartyl proteases are: -
Vertebrate gastric pepsins A and C (also known as gastricsin). - Vertebrate chymosin
(rennin), involved in digestion and used for making cheese. - Vertebrate lysosomal cathepsins
D (EC 3.4.23.5) and E (EC 3.4.23.34). - Mammalian renin (EC 3.4.23.15) whose function is
to generate angiotensin I from angiotensinogen in the plasma. - Fungal proteases such as
aspergillopepsin A (EC 3.4.23.18), candidapepsin (EC 3.4.23.24), mucoropepsin (EC
3.4.23.23) (mucor rennin), endothiapepsin (EC 3.4.23.22), polyporopepsin (EC 3.4.23.29),
and rhizopuspepsin (EC 3.4.23.21). - Yeast saccharopepsin (EC 3.4.23.25) (proteinase A)
(gene PEP4). PEP4 is implicated in posttranslational regulation of vacuolar hydrolases. -
Yeast barrier pepsin (EC 3.4.23.35) (gene BAR1); a protease that cleaves alpha-factor and
thus acts as an antagonist of the mating pheromone. - Fission yeast sxa1 which is involved in
degrading or processing the mating pheromones. Most retroviruses and some plant viruses,
such as badnaviruses, encode an aspartyl protease which is a homodimer of a chain of
about 95 to 125 amino acids. In most retroviruses, the protease is encoded as a segment of
a polyprotein which is cleaved during the maturation process of the virus. It is generally part
of the pol polyprotein and, more rarely, of the gag polyprotein. Conservation of the sequence
around the two aspartates of eukaryotic aspartyl proteases and around the single active site of
the viral proteases allows us to develop a single signature pattern for both groups of protease.



Consensus pattern: [LIVMFGAC]-[LIVMTADN]-[LIVFSA]-D-[ST]-G-[STAV]-
[STAPDENQ]-x-[LIVMFSTNC]-x-[LIVMFGTA] [D is the active site residue]

Note: these proteins belong to families A1 and A2 in the classification of peptidases [4,E1].

[ 1] Foltmann B. Essays Biochem. 17:52-84(1981). 

[ 2] Davies D.R. Annu. Rev. Biophys. Chem. 19:189-215(1990). 

[ 3] Rao J.K.M., Erickson J.W., Wlodawer A. Biochemistry 30:4663-4671(1991). 

[ 4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:105-120(1995). 



60. (BIRA) Biotin repressor 

[1] Wilson KP, Shewchuk LM, Brennan RG, Otsuka AJ, Matthews BW; Proc Natl Acad Sci 
USA 1992;89:9257-9261. 

61. BTB/POZ domain 

The BTB (for BR-C, ttk and bab) [1] or POZ (for Pox virus and Zinc finger) [2] domain is
present near the N-terminus of a fraction of zinc finger (zf-C2H2) proteins and in proteins
that contain the Kelch motif, such as Kelch and a family of pox virus proteins. The BTB/POZ
domain mediates homomeric dimerisation and in some instances heteromeric dimerisation
[2]. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a
tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster
of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule [3].
POZ domains from several zinc finger proteins have been shown to mediate transcriptional
repression and to interact with components of histone deacetylase co-repressor complexes
including N-CoR and SMRT [4,5,6]. The POZ or BTB domain is also known as BR-C/Ttk or
ZiN.

[1] Zollman S, Godt D, Prive GG, Couderc JL, Laski FA; Proc Natl Acad Sci U S A
1994;91:10717-10721. 

[2] Bardwell VJ, Treisman R; Genes Dev 1994;8:1664-1677.

[3] Ahmad KF, Engel CK, Prive GG; Proc Natl Acad Sci U S A 1998;95:12123-12128. 

[4] Deweindt C, Albagli O, Bernardin F, Dhordain P, Quief S,
Lantoine D, Kerckaert JP, Leprince D; Cell Growth Differ 1995;6:1495-1503.

[5] Huynh KD, Bardwell VJ; Oncogene 1998;17:2473-2484. 
[6] Wong CW, Privalsky ML; J Biol Chem 1998;273:27695-27702.

62. (Bac GSPproteins) Bacterial type II secretion system protein D signature 
A number of bacterial proteins, some of which are involved in a general secretion pathway 
(GSP) for the export of proteins (also called the type II pathway) [1 to 5], have been found to 
be evolutionarily related. These proteins are listed below: - The 'D' protein from the GSP
operon of: Aeromonas (gene exeD); Erwinia (gene outD); Escherichia coli (gene yheF);
Klebsiella pneumoniae (gene pulD); Pseudomonas aeruginosa (gene xcpQ); Vibrio cholerae 
(gene epsD) and Xanthomonas campestris (gene xpsD). - comE from Haemophilus 
influenzae, involved in competence (DNA uptake). - pilQ from Pseudomonas aeruginosa, 
which is essential for the formation of the pili. - hofQ (hopQ) from Escherichia coli. - hrpH 
from Pseudomonas syringae, which is involved in the secretion of a proteinaceous elicitor of 
the hypersensitivity response in plants. - hrpA1 from Xanthomonas campestris pv.
vesicatoria, which is also involved in the hypersensitivity response. - mxiD from Shigella 
flexneri which is involved in the secretion of the Ipa invasins which are necessary for 
penetration of intestinal epithelial cells. - omc from Neisseria gonorrhoeae. - yssC from 
Yersinia enterocolitica virulence plasmid pYV, which seems to be required for the export of 
the Yop virulence proteins. - The gpIV protein from filamentous phages such as f1, ike, or
m13. GpIV is said to be involved in phage assembly and morphogenesis. These proteins all
seem to start with a signal sequence and are thought to be integral proteins in the outer
membrane. As a signature pattern a conserved region in the C-terminal section of these
proteins has been selected.

Consensus pattern: [GR]-[DEQKG]-[STVM]-[LIVMA](3)-[GA]-G-[LIVMFY]-x(11)-
[LIVM]-P-[LIVMFYWGS]-[LIVMF]-[GSAE]-x-[LIVM]-P-[LIVMFYW](2)-x(2)-[LV]-F

[ 1] Salmond G.P.C., Reeves P.J. Trends Biochem. Sci. 18:7-12(1993). 

[ 2] Reeves P.J., Whitcombe D., Wharam S., Gibson M., Allison G., Bunce N., Barallon R.,
Douglas P., Mulholland V., Stevens S., Walker S., Salmond G.P.C. Mol. Microbiol.
8:443-456(1993).

[ 3] Martin P.R., Hobbs M., Free P.D., Jeske Y., Mattick J.S. Mol. Microbiol. 9:857- 
868(1993). 
[ 4] Hobbs M., Mattick J.S. Mol. Microbiol. 10:233-243(1993).
[ 5] Genin S., Boucher C.A. Mol. Gen. Genet. 243:112-118(1994). 

63. (Bac globin) Protozoan/cyanobacterial globins signature 

Globins are heme-containing proteins involved in binding and/or transporting oxygen [1].
Almost all globins belong to a large family (see <PDOC00793>); the only exceptions are the
following proteins, which form a family of their own [2,3]: - Monomeric hemoglobins from
the protozoa Paramecium caudatum, Tetrahymena pyriformis and Tetrahymena
thermophila. - Cyanoglobin from the cyanobacterium Nostoc commune. - Globins LI637 and
LI410 from the chloroplast of the alga Chlamydomonas eugametos. - Mycobacterium
tuberculosis hypothetical protein MtCY48.23. These proteins contain a conserved histidine
which could be involved in heme-binding. As a signature pattern, a conserved region that
ends with this residue was used.

Consensus pattern: F-[LF]-x(5)-G-[PA]-x(4)-G-[KRA]-x-[LIVM]-x(3)-H- 

[ 1] Concise Encyclopedia Biochemistry, Second Edition, Walter de Gruyter, Berlin New- 
York (1988). 

[ 2] Takagi T. Curr. Opin. Struct. Biol. 3:413-418(1993). 

[ 3] Couture M., Chamberland H., St-Pierre B., Lafontaine J., Guertin M.; Mol. Gen. Genet. 
243:185-197(1994). 

64. Band 7 protein family signature 

Mammalian band 7 protein [1] (also known as 7.2B or stomatin) is an integral membrane
phosphoprotein of red blood cells thought to regulate cation conductance by interacting with
other proteins of the junctional complex of the membrane skeleton. Structurally, band 7 is
evolutionarily related to the following proteins: - Caenorhabditis elegans protein mec-2 [2].
Mec-2 positively regulates the activity of the putative mechanosensory transduction channel.
It may link the mechanosensory channel and the microtubule cytoskeleton of the touch
receptor neurons. - Caenorhabditis elegans proteins sto-1 to sto-4. - Caenorhabditis elegans
protein unc-1. - Escherichia coli hypothetical protein ybbK. - Mycobacterium tuberculosis
hypothetical protein MtCY277.09. - Synechocystis strain PCC 6803 hypothetical protein
slr1128. - Methanococcus jannaschii hypothetical protein MJ0827. Structurally all these
proteins consist of a short N-terminal domain which is followed by a transmembrane region
and a variable size (from 170 to 350 residues) C-terminal domain. As a signature pattern, a
conserved region located about 110 residues after the transmembrane domain was selected.

Consensus pattern: R-x(2)-[LIV]-[SAN]-x(6)-[LIV]-D-x(2)-T-x(2)-W-G-[LIV]-[KRH]-
[LIV]-x-[KR]-[LIV]-E-[LIV]-[KR]

[ 1] Gallagher P.G., Forget B.G. J. Biol. Chem. 270:26358-26363(1995).

[ 2] Huang M., Gu G., Ferguson E.L., Chalfie M. Nature 378:292-295(1995). 



65. Barwin domain signatures 
Barwin [1] is a barley seed protein of 125 residues that binds weakly to a chitin analog. It
contains six cysteines involved in disulfide bonds, as shown in the following schematic
representation.

xxxxxxxxxxxxxxxCxxxxxxxxxxCxxxxCxCxxxxxxxxCxxxxxxxxxxxxxxxxxxCx

[Schematic: the disulfide-bond connectors and the '*' markers giving the pattern positions are not recoverable.]

'C': conserved cysteine involved in a disulfide bond.
'*': position of the patterns.

Barwin is closely related to the following proteins: - Hevein, a
wound-induced protein found in the latex of rubber trees. - HEL, an Arabidopsis thaliana
hevein-like protein [2]. - Win1 and win2, two wound-induced proteins from potato. -
Pathogenesis-related protein 4 from tobacco. Hevein and the win1/2 proteins consist of an
N-terminal chitin-binding domain followed by a barwin-like C-terminal domain. Barwin and its
related proteins could be involved in a defense mechanism in plants. As signature patterns,
two highly conserved regions that contain some of the cysteines were selected.

Consensus pattern: C-G-[KR]-C-L-x-V-x-N [The two C's are involved in disulfide bonds]

Consensus pattern: V-[DN]-Y-[EQ]-F-V-[DN]-C [C is involved in a disulfide bond]

[ 1] Svensson B., Svendsen I., Hoejrup P., Roepstorff P., Ludvigsen S., Poulsen F.M. 
Biochemistry 31:8767-8770(1992). 
[ 2] Potter S., Uknes S., Lawton K., Winter A.M., Chandler D., Dimaio J., Novitzky R., Ward
E., Ryals J. Mol. Plant Microbe Interact. 6:680-685(1993). 



66. (Bowman-Birk leg) Bowman-Birk serine protease inhibitors family signature

The Bowman-Birk inhibitor family [1] is one of the numerous
families of serine proteinase inhibitors. As can be seen in the schematic representation, they
have a duplicated structure and generally possess two distinct inhibitory sites:



xxCCxxCxxCxx#xxCxxCxxxxCxxxCxxxCxxxxCxx#xxCxxCxxCxxCxx

[Schematic, 70 residues: the disulfide-bond connectors and the '*' markers giving the pattern position are not recoverable.]

'C': conserved cysteine involved in a disulfide bond.
'#': active site residue. 
'*': position of the pattern. 



These inhibitors are found in the seeds of all leguminous plants as well as in
cereal grains. In cereals they exist in two forms, one of which is a 
duplication of the basic structure shown above [2]. The pattern that was developed 
to pick up sequences belonging to this family of inhibitors is in the central 
part of the domain and includes four cysteines. 


Consensus pattern: C-x(5,6)-[DENQKRHSTA]-C-[PASTDH]-[PASTDK]-[ASTDV]-C-
[NDKS]-[DEKRHSTA]-C [The four C's are involved in disulfide bonds]

Note: this pattern can be found twice in some duplicated cereal inhibitors.
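
Since the Bowman-Birk pattern can occur twice in duplicated cereal inhibitors, a scan may need to count the copies rather than only test for presence. The following is a minimal sketch under the assumption that the pattern above is written directly as a regular expression. The names `BOWMAN_BIRK`, `count_signature_copies` and `UNIT` are hypothetical, and `UNIT` is a hand-built peptide that satisfies the pattern, not a real inhibitor sequence.

```python
import re

# C-x(5,6)-[DENQKRHSTA]-C-[PASTDH]-[PASTDK]-[ASTDV]-C-[NDKS]-[DEKRHSTA]-C
BOWMAN_BIRK = re.compile(
    r"C.{5,6}[DENQKRHSTA]C[PASTDH][PASTDK][ASTDV]C[NDKS][DEKRHSTA]C"
)


def count_signature_copies(sequence):
    """Count non-overlapping occurrences of the Bowman-Birk signature."""
    return sum(1 for _ in BOWMAN_BIRK.finditer(sequence.upper()))


# Hand-built peptide satisfying the pattern (illustrative only).
UNIT = "CAAAAADCPPACNDC"

print(count_signature_copies(UNIT))                # single-domain case
print(count_signature_copies(UNIT + "GG" + UNIT))  # duplicated case
```

`finditer` reports non-overlapping hits from left to right, which suits a signature expected once per domain copy.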
[ 1] Laskowski M., Kato I. Annu. Rev. Biochem. 49:593-626(1980).

[ 2] Tashiro M., Hashino K., Shiozaki M., Ibuki R, Maki Z. J. Biochem. 102:297-306(1987). 

67. Pathogenesis-related protein Bet v I family signature 

A number of plant proteins, which all seem to be involved in pathogen defense 
response, are structurally related [1,2,3]. These proteins are: 

- Bet v I, the major pollen allergen from white birch. Bet v I is the main cause of
type I allergic reactions in Europe, North America and USSR.

- Aln g I, the major pollen allergen from alder.

- Api g I, the major allergen from celery.

- Car b I, the major pollen allergen from hornbeam.

- Cor a I, the major pollen allergen from hazel.

- Mal d I, the major pollen allergen from apple.

- Asparagus wound-induced protein AoPR1.

- Kidney bean pathogenesis-related proteins 1 and 2.

- Parsley pathogenesis-related proteins PR1-1 and PR1-3.

- Pea disease resistance response proteins pI49, pI176 and DRRG49-C.

- Pea abscisic acid-responsive proteins ABR17 and ABR18.

- Potato pathogenesis-related proteins STH-2 and STH-21.

- Soybean stress-induced protein SAM22.

These proteins are thought to be intracellularly located. They contain from 155 to 160
amino acid residues. As a signature pattern, a conserved region located in the third quarter of
these proteins has been selected.

Consensus pattern: G-x(2)-[LIVMF]-x(4)-E-x(2)-[CSTAEN]-x(8,9)-[GND]-G-[GS]-[CS]-
x(2)-K-x(4)-[FY]

[1] Breiteneder H., Pettenburger K., Bito A., Valenta R., Kraft D., Rumpold H., Scheiner O., 
Breitenbach M. EMBO J. 8:1935-1938(1989). 

[2] Crowell D., John M.E., Russell D., Amasino R.M. Plant Mol. Biol. 18:459-466(1992). 
[3] Warner S.A.J. , Scott R., Draper J. Plant Mol. Biol. 19:555-561(1992). 




68. bZIP transcription factors basic domain signature 

The bZIP superfamily [1,2] of eukaryotic DNA-binding transcription factors groups together
proteins that contain a basic region mediating sequence-specific DNA-binding followed by a
leucine zipper required for dimerization. This family is quite large; therefore only a partial list
of some representative members appears here. - Transcription factor AP-1, which binds 
selectively to enhancer elements in the cis control regions of SV40 and metallothionein IIA. 
AP-1, also known as c-jun, is the cellular homolog of the avian sarcoma virus 17 (ASV17) 
oncogene v-jun. - Jun-B and jun-D, probable transcription factors which are highly similar to 
jun/AP-1. - The fos protein, a proto-oncogene that forms a non-covalent dimer with c-jun. - 
The fos-related proteins fra-1, and fos B. - Mammalian cAMP response element (CRE) 
binding proteins CREB, CREM, ATF-1, ATF-3, ATF-4, ATF-5, ATF-6 and LRF-1. - Maize 
Opaque 2, a trans-acting transcriptional activator involved in the regulation of the production 
of zein proteins during endosperm development. - Arabidopsis G-box binding factors GBF1 to GBF4,
Parsley CPRF-1 to CPRF-3, Tobacco TAF-1 and wheat EMBP-1. All these proteins bind the 
G-box promoter elements of many plant genes. - Drosophila protein Giant, which represses 
the expression of both the kruppel and knirps segmentation gap genes. - Drosophila Box B 
binding factor 2 (BBF-2), a transcriptional activator that binds to fat body-specific enhancers 
of alcohol dehydrogenase and yolk protein genes. - Drosophila segmentation protein 
cap'n'collar (gene cnc), which is involved in head morphogenesis. - Caenorhabditis elegans 
skn-1, a developmental protein involved in the fate of ventral blastomeres in the early 
embryo. - Yeast GCN4 transcription factor, a component of the general control system that 
regulates the expression of amino acid-synthesizing enzymes in response to amino acid 
starvation, and the related Neurospora crassa cpc-1 protein. - Neurospora crassa cys-3 which 
turns on the expression of structural genes which encode sulfur-catabolic enzymes. - Yeast 
MET28, a transcriptional activator of sulfur amino acids metabolism. - Yeast PDR4 (or 
YAP1), a transcriptional activator of the genes for some oxygen detoxification enzymes. -
Epstein-Barr virus trans-activator protein BZLF1.

Consensus pattern: [KR]-x(1,3)-[RKSAQ]-N-x(2)-[SAQ](2)-x-[RKTAENQ]-x-R-x-[RK]

[ 1] Hurst H.C. Protein Prof. 2:105-168(1995).
[ 2] Ellenberger T. Curr. Opin. Struct. Biol. 4:12-21(1994).

69. Biotin-requiring enzymes attachment site

Biotin, which plays a catalytic role in some carboxyl transfer reactions, is 
covalently attached, via an amide bond, to a lysine residue in enzymes 
requiring this coenzyme [1,2,3,4]. Such enzymes are: 

- Pyruvate carboxylase (EC 6.4.1.1). 

- Acetyl-CoA carboxylase (EC 6.4.1.2). 

- Propionyl-CoA carboxylase (EC 6.4.1.3). 

- Methylcrotonoyl-CoA carboxylase (EC 6.4.1.4). 

- Geranoyl-CoA carboxylase (EC 6.4.1.5). 

- Urea carboxylase (EC 6.3.4.6). 

- Oxaloacetate decarboxylase (EC 4.1.1.3). 

- Methylmalonyl-CoA decarboxylase (EC 4.1.1.41). 

- Glutaconyl-CoA decarboxylase (EC 4.1.1.70). 

- Methylmalonyl-CoA carboxyl-transferase (EC 2.1.3.1) (transcarboxylase).

Sequence data reveal that the region around the biocytin (biotin-lysine)
residue is well conserved and can be used as a signature pattern.

Consensus pattern: [GN]-[DEQTR]-x-[LIVMFY]-x(2)-[LIVM]-x-[AIV]-M-K-[LMAT]-x(3)-
[LIVM]-x-[SAV] [K is the biotin attachment site]

Note: the domain around the biotin-binding lysine residue is evolutionarily related to that
around the lipoyl-binding lysine residue of 2-oxo acid dehydrogenase acyltransferases.
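
Several patterns in this section annotate a functional residue, such as the lysine that carries the biotin group. A scan can report the position of that residue rather than only the start of the hit. The sketch below shows one way to do this with a named group, assuming the biotin attachment pattern is written directly as a regular expression. The names `BIOTIN_SITE` and `attachment_sites` and the test peptide are hypothetical; the peptide was built by hand to satisfy the pattern.

```python
import re

# [GN]-[DEQTR]-x-[LIVMFY]-x(2)-[LIVM]-x-[AIV]-M-K-[LMAT]-x(3)-[LIVM]-x-[SAV]
# The named group "site" marks the lysine given as the biotin attachment site.
BIOTIN_SITE = re.compile(
    r"[GN][DEQTR].[LIVMFY].{2}[LIVM].[AIV]M(?P<site>K)[LMAT].{3}[LIVM].[SAV]"
)


def attachment_sites(sequence):
    """Return the 1-based position of the annotated lysine for every hit."""
    return [m.start("site") + 1 for m in BIOTIN_SITE.finditer(sequence.upper())]


# Hand-built peptide: two leading residues followed by a segment that
# satisfies the pattern (illustrative only).
TEST_PEPTIDE = "MS" + "GDALAALAAMKLAAALAS"

print(attachment_sites(TEST_PEPTIDE))
```

The same approach applies to the other annotated patterns, for example the pyridoxal-phosphate lysine or the lipoyl-binding lysine, by moving the named group to the annotated position.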

[ 1] Knowles J.R. Annu. Rev. Biochem. 58:195-221(1989). 

[ 2] Samols D., Thornton C.G., Murtif V.L., Kumar G.K., Haase F.C., Wood H.G. J. Biol.
Chem. 263:6461-6464(1988). 

[ 3] Goss N.H., Wood H.G. Meth. Enzymol. 107:261-278(1984). 

[ 4] Shenoy B.C., Xie Y., Park V.L., Kumar G.K., Beegen H., Wood H.G., Samols D. J. Biol. 
Chem. 267:18407-18412(1992). 

2-oxo acid dehydrogenases acyltransferase component lipoyl binding site 

The 2-oxo acid dehydrogenase multienzyme complexes [1,2] from bacterial and 
eukaryotic sources catalyze the oxidative decarboxylation of 2-oxo acids to
the corresponding acyl-CoA. The three members of this family of multienzyme 
complexes are: 

- Pyruvate dehydrogenase complex (PDC). 

- 2-oxoglutarate dehydrogenase complex (OGDC). 

- Branched-chain 2-oxo acid dehydrogenase complex (BCOADC). 

These three complexes share a common architecture: they are composed of 
multiple copies of three component enzymes - E1, E2 and E3. E1 is a thiamine
pyrophosphate-dependent 2-oxo acid dehydrogenase, E2 a dihydrolipoamide
acyltransferase, and E3 an FAD-containing dihydrolipoamide dehydrogenase.
E2 acyltransferases have an essential cofactor, lipoic acid, which is
covalently bound via an amide linkage to a lysine group. The E2 components of
OGDC and BCOADC bind a single lipoyl group, while those of PDC bind either one
(in yeast and in Bacillus), two (in mammals), or three (in Azotobacter and in 
Escherichia coli) lipoyl groups [3]. 

In addition to the E2 components of the three enzymatic complexes described 
above, a lipoic acid cofactor is also found in the following proteins: 

- H-protein of the glycine cleavage system (GCS) [4]. GCS is a multienzyme 
complex of four protein components, which catalyzes the degradation of 
glycine. H protein shuttles the methylamine group of glycine from the P
protein to the T protein. H-protein from either prokaryotes or eukaryotes 
binds a single lipoic group. 

- Mammalian and yeast pyruvate dehydrogenase complexes differ from that of 
other sources, in that they contain, in small amounts, a protein of unknown 
function - designated protein X or component X. Its sequence is closely 
related to that of E2 subunits and seems to bind a lipoic group [5]. 

- Fast migrating protein (FMP) (gene acoC) from Alcaligenes eutrophus [6]. 
This protein is most probably a dihydrolipoamide acyltransferase involved in
acetoin metabolism. 

A signature pattern was developed which allows the detection of the lipoyl-binding site.

Consensus pattern: [GN]-x(2)-[LIVF]-x(5)-[LIVFC]-x(2)-[LIVFA]-x(3)-K-[STAIV]-[STAVQDN]-x(2)-[LIVMFS]-x(5)-[GCN]-x-[LIVMFY]
[K is the lipoyl-binding site]

Note: the domain around the lipoyl-binding lysine residue is evolutionarily related to that
around the biotin-binding lysine residue of biotin-requiring enzymes.
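
Because the pattern marks a specific residue (the lipoyl-binding K), it can be useful to report the position of that lysine rather than the start of the whole match. The following is a minimal sketch under that assumption: the regular expression is a hand translation of the consensus pattern above, and the group name `lys` and the function name `lipoyl_lysine_positions` are hypothetical.

```python
import re
from typing import List

# Hand translation of the lipoyl-binding site pattern; the named group
# "lys" captures the lysine flagged in the text as the attachment site.
LIPOYL_SITE = re.compile(
    r"[GN].{2}[LIVF].{5}[LIVFC].{2}[LIVFA].{3}"
    r"(?P<lys>K)"
    r"[STAIV][STAVQDN].{2}[LIVMFS].{5}[GCN].[LIVMFY]"
)


def lipoyl_lysine_positions(sequence: str) -> List[int]:
    """Return 1-based positions of candidate lipoyl-binding lysines."""
    return [m.start("lys") + 1 for m in LIPOYL_SITE.finditer(sequence)]
```

Proteins that bind more than one lipoyl group, such as the E2 component of PDC in some species, would be expected to yield more than one position, provided the matches do not overlap.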

[ 1] Yeaman S.J. Biochem. J. 257:625-632(1989). 

[ 2] Yeaman S.J. Trends Biochem. Sci. 11:293-296(1986). 

[ 3] Russel G.C., Guest J.R. Biochim. Biophys. Acta 1076:225-232(1991). 

[ 4] Fujiwara K., Okamura-Ikeda K., Motokawa Y. J. Biol. Chem. 261:8836-8841(1986). 

[ 5] Behal R.H., Browning K.S., Hall T.B., Reed L.J. Proc. Natl. Acad. Sci. U.S.A. 86:8732-8736(1989).

[ 6] Priefert H., Hein S., Krueger N., Zeh K., Schmidt B., Steinbuechel A. J. Bacteriol.
173:4056-4071(1991).

70. C2 (C2 domain) Number of members: 295 

Some isozymes of protein kinase C (PKC) [1,2] contain a domain, known as C2, of about 
116 amino-acid residues which is located between the two copies of the C1 domain (that
bind phorbol esters and diacylglycerol) (see <PDOC00379>) and the protein kinase 
catalytic domain (see <PDOC00100>). Regions with significant homology [3,E1] to the 
C2-domain have been found in the following proteins: 

- PKC isoforms alpha, beta and gamma and Drosophila isoforms PKC1 and PKC2.

- PKC isoforms delta, epsilon and eta, Caenorhabditis elegans kin-13 and yeast PKC1
have a C2-like domain at the N-terminal extremity [4].

- Yeast cAMP dependent protein kinase SCH9 contains a C2-like domain. 

- Mammalian phosphatidylinositol-specific phospholipase C (PI-PLC) (see <PDOC50007>) 
isoforms beta, gamma and delta as well as several non-mammalian PI-PLCs have a C2-like 
domain C-terminal of the catalytic domain. 

- Mammalian and plant phosphatidylinositol-3-kinases have a C2-like domain in the central
region of the 110 Kd catalytic subunit. 

- Yeast phosphatidylserine-decarboxylase 2 (gene PSD2) contains a C2 domain in its central
region.

- Cytosolic phospholipase D from plants and cytosolic phospholipase A2 have a C2-like
domain at their N-terminus. 

- Synaptotagmins (p65). This is a family of related synaptic vesicle proteins that bind acidic 
phospholipids and that may have a regulatory role in the membrane interactions during 
trafficking of synaptic vesicles at the active zone of the synapse. All isoforms of 
synaptotagmins have two copies of the C2 domain in their C-terminal region. 

- Rabphilin-3A, a synaptic protein, contains two C2 domains.

- Caenorhabditis elegans protein unc-13 whose function is not known. Unc-13 has a C2 
domain in its central part and a C2-like domain at the C-terminus. 

- rasGAP and the breakpoint cluster protein bcr have a C2-domain C-terminal of a PH- 
domain. 

- Yeast protein BUD2 (or CLA2) has a C2-domain in the central region. 

- Yeast protein RSP5 and human protein NEDD-4, both proteins also contain WW domains 
(see <PDOC50020>). 

- Perforin (see <PDOC00251>) has a C2 domain at the C-terminus. It is the only
extracellular protein known to contain a C2 domain. 

- Yeast hypothetical protein YML072C has a C2 domain. 

- Yeast hypothetical protein YNL087W has three C2 domains. 

- Caenorhabditis elegans hypothetical protein F37A4.7 has two C2 domains. 

The C2 domain is thought to be involved in calcium-dependent phospholipid binding [5]. 
Since domains related to the C2 domain are also found in proteins that do not bind calcium, 
other putative functions for the C2 domain, such as binding to inositol-1,3,4,5-tetraphosphate,
have been suggested [6]. Recently, the 3D structure of the first C2 domain of
synaptotagmin has been reported [7]; the domain forms an eight-stranded beta sandwich. The
signature pattern that has been developed for the C2 domain is located in a conserved part of 
that domain, the connecting loop between beta strands 2 and 3. A profile has been 
developed for the C2 domain that covers the total domain. 

-Consensus pattern: [ACG]-x(2)-L-x(2,3)-D-x(1,2)-[NGSTLIF]-[GTMR]-x-[STAP]-D-[PA]-[FY]
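
The C2 pattern contains two variable-length gaps, x(2,3) and x(1,2), so segments of slightly different lengths can satisfy it. The sketch below is a hand translation of the pattern into a Python regular expression, tested against whole candidate segments. The function name `matches_c2_signature` is hypothetical.

```python
import re

# x(2,3) becomes .{2,3} and x(1,2) becomes .{1,2}: each gap accepts any
# residues, but only within the stated minimum and maximum length.
C2_SIGNATURE = re.compile(
    r"[ACG].{2}L.{2,3}D.{1,2}[NGSTLIF][GTMR].[STAP]D[PA][FY]"
)


def matches_c2_signature(segment):
    """True if the whole segment fits the C2 consensus pattern."""
    return C2_SIGNATURE.fullmatch(segment) is not None
```

A segment with gaps of 2 and 1 residues and a segment with gaps of 3 and 2 residues both fit, while a first gap of 4 residues does not. As the note below states, a profile is more sensitive than a pattern of this kind, so a failed pattern match does not rule out a C2 domain.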

-Note: this documentation entry is linked to both a signature pattern and a profile. As the 
profile is much more sensitive than the pattern, you should use it if you have access to the 
necessary software tools to do so.

[1] Medline: 96367095 Extending the C2 domain family: C2s in PKCs delta, epsilon, eta and
theta, phospholipases, GAPs and perforin. Ponting CP, Parker PJ; Protein Sci 1996;5:162-166.

[ 1] Azzi A., Boscoboinik D., Hensey C. Eur. J. Biochem. 208:547-557(1992). 
[ 2] Stabel S. Semin. Cancer Biol. 5:277-284(1994). 

[ 3] Brose N., Hofmann K.O., Hata Y., Suedhof T.C. J. Biol. Chem. 270:25273-25280(1995). 

[ 4] Sossin W.S., Schwartz J.H. Trends Biochem. Sci. 18:207-208(1993). 

[ 5] Davletov B.A., Suedhof T.C. J. Biol. Chem. 268:26386-26390(1993). 

[ 6] Fukuda M., Aruga J., Niinobe M., Aimoto S., Mikoshiba K. J. Biol. Chem. 269:29206-29211(1994).

[ 7] Sutton R.B., Davletov B.A., Berghuis A.M., Suedhof T.C., Sprang S.R. Cell 80:929-938(1995).

71. CAP (CAP protein) Number of members: 11 

In budding and fission yeasts the CAP protein is a bifunctional protein whose N-terminal 
domain binds to adenylyl cyclase, thereby enabling that enzyme to be activated by upstream 
regulatory signals, such as Ras. The function of the C-terminal domain is less clear, but it is 
required for normal cellular morphology and growth control [1]. CAP is conserved in 
higher eukaryotic organisms where its function is not yet clear [2]. 

Structurally, CAP is a protein of 474 to 551 residues which consists of two domains separated
by a proline-rich hinge. Two signature patterns, one corresponding to a conserved region in 
the N-terminal extremity and the other to a C-terminal region have been developed. 

-Consensus pattern: [LIVM](2)-x-R-L-[DE]-x(4)-R-L-E 

-Consensus pattern: D-[LIVMFY]-x-E-x-[PA]-x-P-E-Q-[LIVMFY]-K 

[ 1] Kawamukai M., Gerst J., Field J., Riggs M., Rodgers L., Wigler M., Young D. Mol. Biol. 
Cell 3:167-180(1992). 

[ 2] Yu G., Swiston J., Young D. J. Cell Sci. 107:1671-1678(1994).

72. CAP_GLY (CAP-Gly domain)

CAP stands for cytoskeleton-associated proteins. Swiss:P39937 may be a member but has not
been included. It has a weak match to the family between residues 22-67. Number of
members: 24

[1] Medline: 93242656. Sequence homologies between four cytoskeleton-associated proteins.
Riehemann K, Sorg C; Trends Biochem Sci 1993;18:82-83.

It has been shown [1] that some cytoskeleton-associated proteins (CAP) share the presence 
of a conserved, glycine-rich domain of about 42 residues, called here CAP-Gly. Proteins 
known to contain this domain are listed below. 

- Restin (also known as cytoplasmic linker protein-170 or CLIP-170), a 160 Kd protein 
associated with intermediate filaments and that links endocytic vesicles to microtubules. 
Restin contains two copies of the CAP-Gly domain. 

- Vertebrate dynactin (150 Kd dynein-associated polypeptide; DAP) and Drosophila 
glued, a major component of activator I, a 20S polypeptide complex that stimulates 
dynein-mediated vesicle transport. 

- Yeast protein BIK1 which seems to be required for the formation or stabilization of
microtubules during mitosis and for spindle pole body fusion during conjugation.

- Yeast protein NIP100 (NIP80).

- Human protein CKAP1/TFCB, Schizosaccharomyces pombe protein alp11 and
Caenorhabditis elegans hypothetical protein F53F4.3. These proteins contain an N-terminal
ubiquitin domain (see <PDOC00271>) and a C-terminal CAP-Gly domain.

- Caenorhabditis elegans hypothetical protein M01A8.2. 

- Yeast hypothetical protein YNL148c. 

Structurally, these proteins are made of three distinct parts: an N-terminal section that is 
most probably globular and contains the CAP-Gly domain, a large central region 
predicted to be in an alpha-helical coiled-coil conformation and, finally, a short C- 
terminal globular domain. The signature for the CAP-Gly domain corresponds to the first 32 
residues of the domain and includes five of the six conserved glycines. 

-Consensus pattern: G-x(8,10)-[FYW]-x-G-[LIVM]-x-[LIVMFY]-x(4)-G-K-[NH]-x-G-[STAR]-x(2)-G-x(2)-[LY]-F

[ 1] Riehemann K., Sorg C. Trends Biochem. Sci. 18:82-83(1993).

73. CBD_1 (Cellulose-binding domain, fungal type)

The microbial degradation of cellulose and xylans requires several types of enzymes such as 
endoglucanases (EC 3.2.1.4), cellobiohydrolases (EC 3.2.1.91) (exoglucanases), or xylanases 
(EC 3.2.1.8) [1]. 

Structurally, cellulases and xylanases generally consist of a catalytic domain joined to a 
cellulose-binding domain (CBD) by a short linker sequence rich in proline and/or hydroxy- 
amino acids. 

The CBD of a number of fungal cellulases has been shown to consist of 36 amino acid 
residues. Enzymes known to contain such a domain are: 

- Endoglucanase I (gene egl1) from Trichoderma reesei.

- Endoglucanase II (gene egl2) from Trichoderma reesei. 

- Endoglucanase V (gene egl5) from Trichoderma reesei. 

- Exocellobiohydrolase I (gene CBHI) from Humicola grisea, Neurospora crassa, 
Phanerochaete chrysosporium, Trichoderma reesei, and Trichoderma viride. 

- Exocellobiohydrolase II (gene CBHII) from Trichoderma reesei. 

- Exocellobiohydrolase 3 (gene cel3) from Agaricus bisporus 

- Endoglucanases B, C2, F and K from Fusarium oxysporum. 

The CBD domain is found either at the N-terminal (Cbh-II or egl2) or at the C-terminal
extremity (Cbh-I, egl1 or egl5) of these enzymes. As shown in the following schematic
representation, there are four conserved cysteines in this type of CBD domain, all involved in
disulfide bonds.

xxxxxxxCxxxxxxxxxxCxxxxxCxxxxxxxxxCx

'C': conserved cysteine involved in a disulfide bond.
[Schematic: the lines marking the disulfide bridges and the position of the pattern are not legible in the source.]

Such a domain has also been found in a putative polysaccharide binding protein from the red 
alga, Porphyra purpurea [2]. Structurally, this protein consists of four tandem repeats of the 
CBD domain. 

Consensus pattern: C-G-G-x(4,7)-G-x(3)-C-x(5)-C-x(3,5)-[NHG]-x-[FYWM]-x(2)-Q-C
[The four C's are involved in disulfide bonds]
Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Gilkes N.R., Henrissat B., Kilburn D.G., Miller R.C. Jr., Warren R.A.J. Microbiol. Rev. 
55:303-315(1991). 

[ 2] Liu Q., der Meer J.P., Reith M.E. 

74. CBS domain. 3D Structure found as a subdomain in TIM barrel of inosine-.
CBS domains are small intracellular modules mostly found in two or four copies
within a protein. CBS domains are found in cystathionine-beta-synthase (CBS) where
mutations lead to homocystinuria. Two CBS domains are found in inosine-monophosphate
dehydrogenase from all species; however, the CBS domains are not needed for activity. Two
CBS domains are found in intracellular loops of several chloride channels. Mutations in this
domain of Swiss:P35520 lead to homocystinuria.
Number of members: 414

[1] Medline: 97172695 The structure of a domain common to archaebacteria and the
homocystinuria disease protein. Bateman A; Trends Biochem Sci 1997;22:12-13.
[2] Medline: 96279836 Structure and mechanism of inosine monophosphate dehydrogenase
in complex with the immunosuppressant mycophenolic acid. Sintchak MD, Fleming MA,
Futer O, Raybuck SA, Chambers SP, Caron PR, Murcko MA, Wilson KP; Cell 1996;85:921-930.

Discovery of CBS domain. 

[3] Medline: 97259972 CBS domains in ClC chloride channels implicated in myotonia and
nephrolithiasis (kidney stones). Ponting CP; J Mol Med 1997;75:160-163. 

75. CDP-OH_P_transf (CDP-alcohol phosphatidyltransferase) 

All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol
by a second alcohol with formation of a phosphodiester bond and concomitant
breaking of a phosphoric anhydride bond. Number of members: 32

A number of phosphatidyltransferases involved in phospholipid biosynthesis share this
catalytic property and also share a conserved sequence region [1,2]. These enzymes are:

- Ethanolaminephosphotransferase (EC 2.7.8.1) from yeast (gene EPT1).

- Diacylglycerol cholinephosphotransferase (EC 2.7.8.2) from yeast (gene CPT1).

- Phosphatidylglycerophosphate synthase (EC 2.7.8.5) (CDP-diacylglycerol-glycerol-3-phosphate
3-phosphatidyltransferase) from bacteria (gene pgsA).

- Phosphatidylserine synthase (EC 2.7.8.8) (CDP-diacylglycerol-serine O-phosphatidyltransferase)
from yeast (gene CHO1) and from Bacillus subtilis (gene pssA).

- Phosphatidylinositol synthase (EC 2.7.8.11) (CDP-diacylglycerol-inositol 3-phosphatidyltransferase)
from yeast (gene PIS).

These enzymes are proteins of 200 to 400 amino acid residues. The conserved
region contains three aspartic acid residues and is located in the N-terminal section of the 
sequences. 

-Consensus pattern: D-G-x(2)-A-R-x(8)-G-x(3)-D-x(3)-D 

[1] Medline: 97075020 Two-dimensional 1H-NMR of transmembrane peptides from
Escherichia coli phosphatidylglycerophosphate synthase in micelles. Morein S, Trouard TP,
Hauksson JB, Rilfors L, Arvidson G, Lindblom G; Eur J Biochem 1996;241:489-497.
[ 1] Nikawa J.-I., Kodaki T., Yamashita S. J. Biol. Chem. 262:4876-4881(1987).
[ 2] Hjelmstad R.H., Bell R.M. J. Biol. Chem. 266:5094-5134(1991).

76. CHOD (Cholesterol oxidase) Members of the GMC oxidoreductase family. Number of
members: 3

[1] Medline: 94032271. Crystal structure of cholesterol oxidase complexed with a steroid
substrate: implications for flavin adenine dinucleotide dependent alcohol oxidases. Li J, 
Vrielink A, Brick P, Blow DM; Biochemistry 1993;32:11507-11515. 

The following FAD flavoprotein oxidoreductases have been found [1,2] to be evolutionarily
related. These enzymes, which are called 'GMC oxidoreductases', are listed below.

- Glucose oxidase (EC 1.1.3.4) (GOX) from Aspergillus niger. Reaction catalyzed: glucose
+ oxygen -> delta-gluconolactone + hydrogen peroxide.

- Methanol oxidase (EC 1.1.3.13) (MOX) from fungi. Reaction catalyzed: methanol +
oxygen -> formaldehyde + hydrogen peroxide.

- Choline dehydrogenase (EC 1.1.99.1) (CHD) from bacteria. Reaction catalyzed: choline +
unknown acceptor -> betaine aldehyde + reduced acceptor.

- Glucose dehydrogenase (GLD) (EC 1.1.99.10) from Drosophila. Reaction catalyzed:
glucose + unknown acceptor -> delta-gluconolactone + reduced acceptor.

- Cholesterol oxidase (CHOD) (EC 1.1.3.6) from Brevibacterium sterolicum and
Streptomyces strain SA-COO. Reaction catalyzed: cholesterol + oxygen -> cholest-4-en-3-one
+ hydrogen peroxide.

- AlkJ [3], an alcohol dehydrogenase from Pseudomonas oleovorans, which converts
aliphatic medium-chain-length alcohols into aldehydes.

This family also includes a lyase:

- (R)-mandelonitrile lyase (EC 4.1.2.10) (hydroxynitrile lyase) from plants [4], an enzyme
involved in cyanogenesis, the release of hydrogen cyanide from injured tissues.

These enzymes are proteins of size ranging from 556 (CHD) to 664 (MOX) amino acid 
residues which share a number of regions of sequence similarities. One of these regions, 
located in the N-terminal section, corresponds to the FAD ADP-binding domain. The
function of the other conserved domains is not yet known; two of these domains have been
selected as signature patterns. The first one is located in the N-terminal section of these
enzymes, about 50 residues after the ADP-binding domain, while the second one is
located in the central section.

-Consensus pattern: [GA]-[RKN]-x-[LIV]-G(2)-[GST](2)-x-[LIVM]-N-x(3)-[FYWA]-x(2)-[PAG]-x(5)-[DNESH]

-Consensus pattern: [GS]-[PSTA]-x(2)-[ST]-P-x-[LIVM](2)-x(2)-S-G-[LIVM]-G

[ 1] Cavener D.R. J. Mol. Biol. 223:811-814(1992). 

[ 2] Henikoff S., Henikoff J.G. Genomics 19:97-107(1994). 

[ 3] van Beilen J.B., Eggink G., Enequist H., Bos R., Witholt B. Mol. Microbiol. 6:3121- 
3136(1992). 

[ 4] Cheng I.P., Poulton J.E. Plant Cell Physiol. 34:1139-1143(1993).

77. CKS (Cyclin-dependent kinase regulatory subunit) Number of members: 11. Cyclin-dependent
kinases (CDK) are protein kinases which associate with cyclins to regulate
eukaryotic cell cycle progression. The best-known CDK is p34-cdc2 (CDC28 in yeast),
which is required for entry into S-phase and mitosis. CDKs bind to a regulatory subunit
which is essential for their biological function. This regulatory subunit is a small protein of
79 to 150 residues. In yeast (gene CKS1) and in fission yeast (gene suc1) a single isoform
is known, while mammals have two highly related isoforms. It has been shown [1] that
these CDK regulatory subunits assemble as a hexamer which then acts as a hub for the
oligomerization of six CDK catalytic subunits. The sequences of CDK regulatory subunits are
highly conserved; therefore, the two most conserved regions have been used as signature
patterns.

-Consensus pattern: Y-S-x-[KR]-Y-x-[DE](2)-x-[FY]-E-Y-R-H-V-x-[LV]-[PT]-[KRP] 
-Consensus pattern: H-x-P-E-x-H-[IV]-L-L-F-[KR] 
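
Several families in this section, including this one, are described by two signature patterns. One way to use such a pair is to report which of the signatures a candidate sequence carries. The sketch below is illustrative only: the two regular expressions are hand translations of the consensus patterns above, and the labels `CKS_1` and `CKS_2` and the function name `matched_signatures` are hypothetical.

```python
import re

# Hand translations of the two CKS consensus patterns given above.
CKS_SIGNATURES = {
    "CKS_1": re.compile(r"YS.[KR]Y.[DE]{2}.[FY]EYRHV.[LV][PT][KRP]"),
    "CKS_2": re.compile(r"H.PE.H[IV]LLF[KR]"),
}


def matched_signatures(sequence):
    """Return the labels of the signatures found anywhere in the sequence."""
    return [
        name for name, regex in CKS_SIGNATURES.items()
        if regex.search(sequence)
    ]
```

A sequence carrying both signatures is a stronger candidate for family membership than one carrying a single signature, though neither outcome replaces a full sequence comparison.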

[ 1] Parge H.E., Arvai A.S., Murtari D.J., Reed S.I., Tainer J.A. Science 262:387-395(1993).

78. CK_II_beta (Casein kinase II regulatory subunit)

Number of members: 16. Casein kinase II (CK-2) [1] is a ubiquitous eukaryotic
serine/threonine protein kinase which is found both in the cytoplasm and the nucleus and
whose substrates are numerous. It generally phosphorylates Ser or Thr at the N-terminal
end of a stretch of acidic residues (see <PDOC00006>). CK-2 exists as a heterotetramer
composed of two catalytic subunits (alpha) and two regulatory subunits (beta). In most
species there are two closely related isoforms of the catalytic subunit: alpha and alpha'. 
Some species, such as fungi and plants, express two forms of regulatory subunits: beta and 
beta'. The exact function of the regulatory subunit is not yet known. It is a highly conserved 
protein of about 25 Kd that contains, in its central section, a cysteine-rich motif that could 
be involved in binding a metal such as zinc [2]. This region has been used as a signature 
pattern. 

-Consensus pattern: C-P-x-[LIVMY]-x-C-x(5)-[LI]-P-[LIVMC]-G-x(9)-V-[KR]-x(2)-C-P-x- 
C 

[ 1] Allende J.E., Allende C.C. FASEB J. 9:313-323(1995). 

[ 2] Reed J.C., Bidwai A.P., Glover C.V.C. J. Biol. Chem. 269:18192-18200(1994). 
79. CLP_protease (Clp protease) 

These proteins belong to family S14 in the classification of peptidases. 

-!- The Clp protease has an active site catalytic triad. In E. coli Clp protease, ser-111, his- 
136 and asp-185 form the catalytic triad. 

-!- Swiss:P48254 has lost all of these active site residues and is therefore inactive. 

-!- Swiss:P42379 contains two large insertions, Swiss:P42380 contains one large insertion. 
Number of members: 38 
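
The notes above identify the catalytic triad by residue position (ser-111, his-136 and asp-185 in the E. coli Clp protease) and state that a member lacking these residues is inactive. A minimal sketch of such a position check follows. It assumes, purely for illustration, that the supplied sequence uses the same numbering as the positions quoted in the text; for a homologous protein the positions would first have to be mapped through an alignment. The names `CLP_TRIAD`, `has_catalytic_triad` and `make_toy_sequence` are hypothetical.

```python
# Catalytic triad of the E. coli Clp protease as given in the text:
# 1-based position -> expected one-letter residue code.
CLP_TRIAD = {111: "S", 136: "H", 185: "D"}


def has_catalytic_triad(sequence, triad=CLP_TRIAD):
    """True if every triad position exists and carries the expected residue."""
    return all(
        pos <= len(sequence) and sequence[pos - 1] == residue
        for pos, residue in triad.items()
    )


def make_toy_sequence(length, residues):
    """Build a poly-alanine toy sequence with residues at 1-based positions."""
    seq = ["A"] * length
    for pos, residue in residues.items():
        seq[pos - 1] = residue
    return "".join(seq)
```

With a toy sequence built by `make_toy_sequence(200, CLP_TRIAD)` the check succeeds, and replacing any one of the three residues makes it fail, which mirrors the reasoning applied to Swiss:P48254 above.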

The endopeptidase Clp (EC 3.4.21.92) from Escherichia coli cleaves peptides in various 
proteins in a process that requires ATP hydrolysis [1,2]. Clp is a dimeric protein which 
consists of a proteolytic subunit (gene clpP) and either of two related ATP-binding regulatory
subunits (genes clpA and clpX). ClpP is a serine protease which has a chymotrypsin-like 
activity. Its catalytic activity seems to be provided by a charge relay system similar to that 
of the trypsin family of serine proteases, but which evolved by independent convergent
evolution. Proteases highly similar to ClpP have been found to be encoded in the genome of
the chloroplast of plants and seem to be also present in other eukaryotes. The sequences 
around two of the residues involved in the catalytic triad (a serine and a histidine) are 
highly conserved and can be used as signature patterns specific to that category of 
proteases. 

-Consensus pattern: T-x(2)-[LIVMF]-G-x-A-[SAC]-S-[MSA]-[PAG]-[STA] [S is the active 
site residue] 

-Consensus pattern: R-x(3)-[EAP]-x(3)-[LIVMFYT]-M-[LIVM]-H-Q-P [H is the active site 
residue] 

[1] Medline: 98050920. The structure of ClpP at 2.3 angstroms resolution suggests a model
for ATP-dependent proteolysis. Wang J, Hartling JA, Flanagan JM; Cell 1997;91:447-456. 
[ 1] Maurizi M.R., Clark W.P., Kim S.-H., Gottesman S. J. Biol. Chem. 265:12546- 
12552(1990). 

[ 2] Gottesman S., Maurizi M.R. Microbiol. Rev. 56:592-621(1992). 
[ 3] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994). 

80. CNG_membrane (Transmembrane region cyclic Nucleotide Gated Channel)
[1] Medline: 94224763. Cyclic nucleotide-gated channels: an expanding new family of ion
channels. Yau KW; Proc Natl Acad Sci USA 1994;91:3481-3483.

This family is found N-terminal to the cNMP_binding domain. Number of members: 56.
Proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 
120 residues [1-3]. The best studied of these proteins is the prokaryotic catabolite gene 
activator (also known as the cAMP receptor protein) (gene crp) where such a domain 
is known to be composed of three alpha-helices and a distinctive eight-stranded, 
antiparallel beta-barrel structure. Such a domain is known to exist in the following proteins: 

- Prokaryotic catabolite gene activator protein (CAP). 

- cAMP- and cGMP-dependent protein kinases (cAPK and cGPK). Both types of kinases
contain two tandem copies of the cyclic nucleotide-binding domain. The cAPK's are
composed of two different subunits: a catalytic chain and a regulatory chain which contains
both copies of the domain. The cGPK's are single chain enzymes that include the two copies
of the domain in their N-terminal section. The nucleotide specificity of cAPK and cGPK is
due to an amino acid in the conserved region of beta-barrel 7: a threonine that is invariant in
cGPK is an alanine in most cAPK.

- Vertebrate cyclic nucleotide-gated ion-channels. Two such cation channels have been
fully characterized. One is found in rod cells where it plays a role in visual signal
transduction. It specifically binds to cGMP leading to an opening of the channel and
thereby causing a depolarization of rod photoreceptors. In olfactory epithelium a similar,
cAMP-binding, channel plays a role in odorant signal transduction.

There are six invariant amino acids in this domain, three of which are glycine residues
that are thought to be essential for maintenance of the structural integrity of the
beta-barrel. Two signature patterns have been developed for this domain. The first pattern
is located within beta-barrels 2 and 3 and contains the first two conserved Gly. The second
pattern is located within beta-barrels 6 and 7 and contains the third conserved Gly as well
as the three other invariant residues.

-Consensus pattern: [LIVM]-[VIC]-x(2)-G-[DENQTA]-x-[GAC]-x(2)-[LIVMFY](4)-x(2)-G
-Consensus pattern: [LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV]

[ 1] Weber I.T., Shabb J.B., Corbin J.D. Biochemistry 28:6122-6127(1989). 

[ 2] Kaupp U.B. Trends Neurosci. 14:150-157(1991). 

[ 3] Shabb J.B., Corbin J.D. J. Biol. Chem. 267:5723-5726(1992). 

81. COX10_ctaB_cyoE (Cytochrome c oxidase assembly factor) 

[1] Medline: 95191390
Biosynthesis and functional role of haem O and haem A
Mogi T, Saiki K, Anraku Y; Mol Microbiol 1994;14:391-398.
Cytochrome c oxidase is a multisubunit enzyme. The complexity
of this enzyme requires assistance in building the complex.
This is carried out by the Cytochrome c oxidase assembly factor.
Number of members: 31

Cytochrome c oxidase is an oligomeric enzymatic complex which seems to require
the aid of a number of proteins that either act as chaperonins to help the
subunits of the enzyme to fold correctly, or assist in the assembly of the
metal centers [1]. One of these subunits is known as COX10 in yeast and as
ctaB [2] in aerobic prokaryotes. It is evolutionarily related to cyoE protein
from the Escherichia coli cytochrome O terminal oxidase complex.

These proteins probably contain [3] seven transmembrane segments. The most 
conserved region is located in a loop between the second and third of these 
segments and has been selected as a signature pattern. 

-Consensus pattern: [ED]-x-D-x(2)-M-x-R-T-x(2)-R-x(4)-G 

[ 1] Nobrega M.P., Nobrega E.G., Tzagoloff A.
J. Biol. Chem. 265:14220-14226(1990).
[ 2] Cao J., Hosler J., Shapleigh J., Revzin A., Ferguson-Miller S.
J. Biol. Chem. 267:24273-24278(1992).
[ 3] Chepuri V., Gennis R.B.
J. Biol. Chem. 265:12978-12986(1990).

82. COX3 (Cytochrome c oxidase subunit III) 
This family corresponds to chains c and p. 

[1] Medline: 96216288
The whole structure of the 13-subunit oxidized cytochrome c
oxidase at 2.8 A. Tsukihara T, Aoyama H, Yamashita E, Tomizaki T, Yamaguchi H,
Shinzawa-Itoh K, Nakashima R, Yaono R, Yoshikawa S; Science 1996;272:1136-1144.
Number of members: 224

83. COX5B (Cytochrome c oxidase subunit Vb) 
[1] Medline: 96216288
The whole structure of the 13-subunit oxidized cytochrome c
oxidase at 2.8 A.
Tsukihara T, Aoyama H, Yamashita E, Tomizaki T, Yamaguchi H,
Shinzawa-Itoh K, Nakashima R, Yaono R, Yoshikawa S;
Science 1996;272:1136-1144.

This family consists of chains F and S 
Number of members: 10 

Cytochrome c oxidase (EC 1.9.3.1) [1] is an oligomeric enzymatic complex which
is a component of the respiratory chain complex and is involved in the
transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme
complex is located in the mitochondrial inner membrane; in aerobic prokaryotes
it is found in the plasma membrane. In addition to the three large subunits
that form the catalytic center of the enzyme complex there are, in eukaryotes,
a variable number of small polypeptidic subunits. One of these subunits, which
is known as Vb in mammals, V in slime mold and IV in yeast, binds a zinc atom.
The sequence of subunit Vb is well conserved and includes three conserved
cysteines that are thought to coordinate the zinc ion [2]. Two of these
cysteines are clustered in the C-terminal section of the subunit; this region has been selected
as a signature pattern.

-Consensus pattern: [LIVM](2)-[FYW]-x(10)-C-x(2)-C-G-x(2)-[FY]-K-L [The two C's 
probably bind zinc] 

[ 1] Capaldi R.A., Malatesta F., Darley-Usmar V.M.
Biochim. Biophys. Acta 726:135-148(1983).
[ 2] Rizzuto R., Sandona D., Brini M., Capaldi R.A., Bisson R.
Biochim. Biophys. Acta 1129:100-104(1991).

84. COesterase (Carboxylesterases) 
The prints entry is specific to acetylcholinesterase
Number of members: 273

Higher eukaryotes have many distinct esterases. Among the different types are 
those which act on carboxylic esters (EC 3.1.1.-). Carboxyl-esterases have 
been classified into three categories (A, B and C) on the basis of 
differential patterns of inhibition by organophosphates. The sequence of a 
number of type-B carboxylesterases indicates [1,2,3] that the majority are
evolutionarily related. This family currently consists of the following
proteins:

- Acetylcholinesterase (EC 3.1.1.7) (AChE) [E1] from vertebrates and from
Drosophila. 

- Mammalian cholinesterase II (butyryl cholinesterase) (EC 3.1.1.8). 
Acetylcholinesterase and cholinesterase II are closely related enzymes that 
hydrolyze choline esters [4]. 

- Mammalian liver microsomal carboxylesterases (EC 3.1.1.1). 

- Drosophila esterase 6, produced in the anterior ejaculatory duct of the 
male insect reproductive system where it plays an important role in its 
reproductive biology. 

- Drosophila esterase P. 

- Culex pipiens (mosquito) esterases B1 and B2.

- Myzus persicae (peach-potato aphid) esterases E4 and FE4. 

- Mammalian bile-salt-activated lipase (BAL) [5], a multifunctional lipase
which catalyzes fat and vitamin absorption. It is activated by bile salts
in infant intestine where it helps to digest milk fats.

- Insect juvenile hormone esterase (JH esterase) (EC 3.1.1.59). 

- Lipases (EC 3.1.1.3) from the fungi Geotrichum candidum and Candida rugosa. 

- Caenorhabditis gut esterase (gene ges-1). 

- Duck fatty acyl-CoA hydrolase, medium chain (EC 3.1.2.14), an enzyme that 
may be associated with peroxisome proliferation and may play a role in the 
production of 3-hydroxy fatty acid diester pheromones. 

- Membrane enclosed crystal proteins from slime mold. These proteins are
most probably esterases; the vesicles where they are found have therefore
been termed esterosomes.

So far two bacterial proteins have been found to belong to this family:

- Phenmedipham hydrolase (phenylcarbamate hydrolase), an Arthrobacter oxidans
plasmid-encoded enzyme (gene pcd) that degrades the phenylcarbamate 
herbicides phenmedipham and desmedipham by hydrolyzing their central 
carbamate linkages. 

- Para-nitrobenzyl esterase from Bacillus subtilis (gene pnbA). 

The following proteins, while having lost their catalytic activity, contain a 
domain evolutionarily related to that of carboxylesterases type-B:

- Thyroglobulin (TG), a glycoprotein specific to the thyroid gland, which is
the precursor of the iodinated thyroid hormones thyroxine (T4) and triiodothyronine (T3).

- Drosophila protein neurotactin (gene nrt) which may mediate or modulate cell
adhesion between embryonic cells during development.

- Drosophila protein glutactin (gene glt), whose function is not known.

As is the case for lipases and serine proteases, the catalytic apparatus of
esterases involves three residues (catalytic triad): a serine, a glutamate or
aspartate and a histidine. The sequence around the active site serine is well
conserved and can be used as a signature pattern. A conserved region located in
the N-terminal section containing a cysteine involved in a disulfide bond
has been selected as a second signature pattern.

-Consensus pattern: F-[GR]-G-x(4)-[LIVM]-x-[LIV]-x-G-x-S-[STAG]-G [S is the active site
residue]

-Consensus pattern: [ED]-D-C-L-[YT]-[LIV]-[DNS]-[LIV]-[LIVFYW]-x-[PQR] [C is
involved in a disulfide bond]
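
The consensus patterns in this section use the PROSITE notation: a single letter is a fixed residue, a bracketed set lists the residues allowed at one position, x is any residue, and a number in parentheses repeats the preceding element. As a minimal sketch, such a pattern can be translated into a regular expression and used to locate the active-site serine. The function names and the test peptide below are illustrative assumptions and are not part of the original disclosure.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        if not element:
            continue
        # Split an element such as "x(4)" or "[LIVM](2)" into core and repeat.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.groups()
        if core == "x":
            core = "."  # x stands for any residue
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)

# Patterns copied from the text.
ACTIVE_SITE = "F-[GR]-G-x(4)-[LIVM]-x-[LIV]-x-G-x-S-[STAG]-G"
DISULFIDE = "[ED]-D-C-L-[YT]-[LIV]-[DNS]-[LIV]-[LIVFYW]-x-[PQR]"

def find_active_site_serine(sequence: str):
    """Return the 0-based index of the signature serine, or None."""
    m = re.search(prosite_to_regex(ACTIVE_SITE), sequence)
    # The serine is the 14th position of the 16-residue pattern.
    return m.start() + 13 if m else None

# Hypothetical peptide built only to satisfy the pattern (not a real protein).
demo = "MK" + "FGGAAAALAIAGESAG" + "TT"
print(prosite_to_regex(ACTIVE_SITE))
print(find_active_site_serine(demo))
```

A match only indicates that the signature is present; it does not by itself establish catalytic activity, as the inactive family members listed above show.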

[ 1] Myers M., Richmond R.C., Oakeshott J.G. Mol. Biol. Evol. 5:113-119(1988).
[ 2] Krejci E., Duval N., Chatonnet A., Vincens P., Massoulie J. Proc. Natl. Acad. Sci. U.S.A.
88:6647-6651(1991).
[ 3] Cygler M., Schrag J.D., Sussman J.L., Harel M., Silman I., Gentry M.K., Doctor B.P.
Protein Sci. 2:366-382(1993).
[ 4] Lockridge O. BioEssays 9:125-128(1988).
[ 5] Wang C.-S., Hartsuck J.A. Biochim. Biophys. Acta 1166:1-19(1993).



85. CPSase_L_chain (Carbamoyl-phosphate synthase (CPSase)) 
[1] 

Medline: 94347758 

Three-dimensional structure of the biotin carboxylase subunit
of acetyl-CoA carboxylase.
Waldrop GL, Rayment I, Holden HM;
Biochemistry 1994;33:10249-10256.
[2]

Medline: 90285162

Mammalian carbamyl phosphate synthetase (CPS). DNA sequence and
evolution of the CPS domain of the Syrian hamster multifunctional
protein CAD.

Simmer JP, Kelly RE, Rinker AG Jr, Scully JL, Evans DR;
J Biol Chem 1990;265:10395-10402.
Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of 
carbamyl-phosphate from glutamine or ammonia and bicarbonate. This 
important enzyme initiates both the urea cycle and the biosynthesis 
of arginine and/or pyrimidines [2]. 

The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a 
heterodimer of a small and large chain. The small chain promotes 
the hydrolysis of glutamine to ammonia, which is used by the large 
chain to synthesize carbamoyl phosphate. See CPSase_sm_chain. 
The small chain has a GATase domain in the carboxyl terminus. 
See GATase. 
Number of members: 181

Carbamoyl-phosphate synthase (CPSase) catalyzes the ATP-dependent synthesis of
carbamyl-phosphate from glutamine (EC 6.3.5.5) or ammonia (EC 6.3.4.16) and 
bicarbonate [1]. This important enzyme initiates both the urea cycle and the 
biosynthesis of arginine and pyrimidines. 

Glutamine-dependent CPSase (CPSase II) is involved in the biosynthesis of
pyrimidines and arginine. In bacteria such as Escherichia coli, a single enzyme
is involved in both biosynthetic pathways while other bacteria have separate
enzymes. The bacterial enzymes are formed of two subunits: a small chain (gene
carA) that provides glutamine amidotransferase activity (GATase) necessary for
removal of the ammonia group from glutamine, and a large chain (gene carB)
that provides CPSase activity. Such a structure is also present in fungi for
arginine biosynthesis (genes CPA1 and CPA2). In most eukaryotes, the first
three steps of pyrimidine biosynthesis are catalyzed by a large
multifunctional enzyme - called URA2 in yeast, rudimentary in Drosophila and
CAD in mammals [2]. The CPSase domain is located between an N-terminal GATase
domain and the C-terminal part which encompasses the dihydroorotase and
aspartate transcarbamylase activities.

Ammonia-dependent CPSase (CPSase I) is involved in the urea cycle in ureolytic 
vertebrates; it is a monofunctional protein located in the mitochondrial 
matrix. 

The CPSase domain is typically 120 Kd in size and has arisen from the 
duplication of an ancestral subdomain of about 500 amino acids. Each subdomain 
independently binds to ATP and it is suggested that the two homologous halves 
act separately, one to catalyze the phosphorylation of bicarbonate to carboxy 
phosphate and the other that of carbamate to carbamyl phosphate. 

The CPSase subdomain is also present in a single copy in the biotin-dependent 
enzymes acetyl-CoA carboxylase (EC 6.4.1.2) (ACC), propionyl-CoA carboxylase 
(EC 6.4.1.3) (PCCase), pyruvate carboxylase (EC 6.4.1.1) (PC) and urea
carboxylase (EC 6.3.4.6).

Two conserved regions which are probably important for binding ATP and/or catalytic 
activity have been selected as signatures for the subdomain. 

-Consensus pattern: [FYV]-[PS]-[LIVMC]-[LIVMA]-[LIVM]-[KR]-[PSA]-[STA]-x(3)- 
[SG]-G-x-[AG] 

-Consensus pattern: [LIVMF]-[LIMN]-E-[LIVMCA]-N-[PATLIVM]-[KR]-[LIVMSTAC] 

[ 1] Simmer J.P., Kelly R.E., Rinker A.G. Jr., Scully J.L., Evans D.R. 

J. Biol. Chem. 265:10395-10402(1990). 
[ 2] Davidson J.N., Chen K.C., Jamison R.S., Musmanno L.A., Kern C.B. 

BioEssays 15:157-164(1993). 

86. CPSase_sm_chain (Carbamoyl-phosphate synthase small chain, CPSase domain) 
[1] 

Medline: 90285162 

Mammalian carbamyl phosphate synthetase (CPS). DNA sequence and 
evolution of the CPS domain of the Syrian hamster multifunctional 
protein CAD. 

Simmer JP, Kelly RE, Rinker AG Jr, Scully JL, Evans DR;
J Biol Chem 1990;265:10395-10402.
The carbamoyl-phosphate synthase domain is in the amino terminus of
the protein.

Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of
carbamyl-phosphate from glutamine or ammonia and bicarbonate. This 
important enzyme initiates both the urea cycle and the biosynthesis 
of arginine and/or pyrimidines [1]. 

The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a 
heterodimer of a small and large chain. The small chain promotes 
the hydrolysis of glutamine to ammonia, which is used by the large 
chain to synthesize carbamoyl phosphate. See CPSase_L_chain.
The small chain has a GATase domain in the carboxyl terminus.
See GATase. 
Number of members: 46 

Carbamoyl-phosphate synthase (CPSase) catalyzes the ATP-dependent synthesis of 
carbamyl-phosphate from glutamine (EC 6.3.5.5) or ammonia (EC 6.3.4.16) and 
bicarbonate [1]. This important enzyme initiates both the urea cycle and the
biosynthesis of arginine and pyrimidines. 

Glutamine-dependent CPSase (CPSase II) is involved in the biosynthesis of
pyrimidines and arginine. In bacteria such as Escherichia coli, a single enzyme
is involved in both biosynthetic pathways while other bacteria have separate
enzymes. The bacterial enzymes are formed of two subunits: a small chain (gene
carA) that provides glutamine amidotransferase activity (GATase) necessary for
removal of the ammonia group from glutamine, and a large chain (gene carB)
that provides CPSase activity. Such a structure is also present in fungi for
arginine biosynthesis (genes CPA1 and CPA2). In most eukaryotes, the first
three steps of pyrimidine biosynthesis are catalyzed by a large
multifunctional enzyme - called URA2 in yeast, rudimentary in Drosophila and
CAD in mammals [2]. The CPSase domain is located between an N-terminal GATase
domain and the C-terminal part which encompasses the dihydroorotase and
aspartate transcarbamylase activities.

Ammonia-dependent CPSase (CPSase I) is involved in the urea cycle in ureolytic 
vertebrates; it is a monofunctional protein located in the mitochondrial 
matrix. 

The CPSase domain is typically 120 Kd in size and has arisen from the 
duplication of an ancestral subdomain of about 500 amino acids. Each subdomain 
independently binds to ATP and it is suggested that the two homologous halves 
act separately, one to catalyze the phosphorylation of bicarbonate to carboxy 
phosphate and the other that of carbamate to carbamyl phosphate.

The CPSase subdomain is also present in a single copy in the biotin-dependent
enzymes acetyl-CoA carboxylase (EC 6.4.1.2) (ACC), propionyl-CoA carboxylase
(EC 6.4.1.3) (PCCase), pyruvate carboxylase (EC 6.4.1.1) (PC) and urea 
carboxylase (EC 6.3.4.6). 

Two conserved regions which are probably important for binding ATP and/or catalytic 
activity have been selected as signatures for the subdomain. 

-Consensus pattern: [FYV]-[PS]-[LIVMC]-[LIVMA]-[LIVM]-[KR]-[PSA]-[STA]-x(3)- 
[SG]-G-x-[AG] 

-Consensus pattern: [LIVMF]-[LIMN]-E-[LIVMCA]-N-[PATLIVM]-[KR]-[LIVMSTAC] 

[ 1] Simmer J.P., Kelly R.E., Rinker A.G. Jr., Scully J.L., Evans D.R. 

J. Biol. Chem. 265:10395-10402(1990). 
[ 2] Davidson J.N., Chen K.C., Jamison R.S., Musmanno L.A., Kern C.B. 
BioEssays 15:157-164(1993). 

87. CRAL_TRIO (CRAL/TRIO domain) 
[1] 

Medline: 98121119 

Crystal structure of the Saccharomyces cerevisiae phosphatidyl- 
inositol-transfer protein. 

Sha B, Phillips SE, Bankaitis VA, Luo M; 
Nature 1998;391:506-510. 

The original profile has been extended to include the carboxyl
domain from the known structure of Sec14. Swiss:P10911 has not
been included in the Pfam family because it does not appear to
contain a complete structural domain.
Number of members: 39 



88. CSD ('Cold-shock' DNA-binding domain)
[1]

Medline: 94255482 

Crystal structure of CspA, the major cold shock 
protein of Escherichia coli. 
Schindelin H, Jiang W, Inouye M, Heinemann U; 
Proc Natl Acad Sci U S A 1994;91:5119-5123. 
Number of members: 121 

A conserved domain of about 70 amino acids has been found in prokaryotic and 
eukaryotic DNA-binding proteins [1,2,3,E1]. This domain, which is known as the 
'cold-shock domain' (CSD) is present in the proteins listed below. 

- Escherichia coli protein CS7.4 (gene cspA) which is induced in response to
low temperature (cold-shock protein) and which binds to and stimulates the
transcription of the CCAAT-containing promoters of the H-NS protein and of
gyrA.

- Mammalian Y box binding protein 1 (YB1). A protein that binds to the CCAAT-
containing Y box of mammalian HLA class II genes.

- Xenopus Y box binding proteins -1 and -2 (Y1 and Y2). Proteins that bind to
the CCAAT-containing Y box of Xenopus hsp70 genes.

- Xenopus B box binding protein (YB3). YB3 binds the B box promoter element
of genes transcribed by RNA polymerase III.

- Enhancer factor I subunit A (EFI-A) (dbpB). A protein that also binds to
the CCAAT motif in various gene promoters.

- DbpA, a human DNA-binding protein of unknown specificity.

- Bacillus subtilis cold-shock proteins cspB and cspC. 

- Streptomyces clavuligerus protein SC 7.0. 

- Escherichia coli proteins cspB, cspC, cspD, cspE and cspF. 

- Unr, a mammalian gene encoded upstream of the N-ras gene. Unr contains nine 
repeats that are similar to the CSD domain. The function of Unr is not yet 
known but it could be a multivalent DNA-binding protein. 

The most conserved region of the CSD domain, located in its N-terminal section,
has been selected as a signature pattern. The beginning of this region is
highly similar [4] to the RNP-1 RNA-binding motif.

-Consensus pattern: [FY]-G-F-I-x(6,7)-[DER]-[LIVM]-F-x-H-x-[STKR]-x-[LIVMFY] 
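
The x(6,7) element in this pattern allows a gap of either six or seven residues. A hand translation into a regular expression illustrates the effect. The peptides below are hypothetical constructs made only to exercise the pattern; they are not real cold-shock protein sequences, and the helper names are assumptions of this sketch.

```python
import re

# [FY]-G-F-I-x(6,7)-[DER]-[LIVM]-F-x-H-x-[STKR]-x-[LIVMFY]
CSD_SIGNATURE = re.compile(r"[FY]GFI.{6,7}[DER][LIVM]F.H.[STKR].[LIVMFY]")

def synthetic_csd(gap: int) -> str:
    """Build an artificial peptide with `gap` alanines in the x(6,7) slot."""
    return "YGFI" + "A" * gap + "DLFAHASAL"

def has_csd_signature(sequence: str) -> bool:
    """True if the CSD signature occurs anywhere in the sequence."""
    return CSD_SIGNATURE.search(sequence) is not None

# Only gaps of six or seven residues satisfy the pattern.
for gap in (5, 6, 7, 8):
    print(gap, has_csd_signature(synthetic_csd(gap)))
```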



[ 1] Doniger J., Landsman D., Gonda M.A., Wistow G. 

New Biol. 4:389-395(1992). 
[ 2] Wistow G. 

Nature 344:823-824(1990). 
[ 3] Jones P.G., Inouye M. 

Mol. Microbiol. 11:811-818(1994). 
[ 4] Landsman D. 

Nucleic Acids Res. 20:2861-2864(1992). 



89. CTF_NFI (CTF/NF-I family) 
Number of members: 45 

Nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF) [1,2] 
(also known as TGGCA-binding proteins) are a family of vertebrate nuclear 
proteins which recognize and bind, as dimers, the palindromic DNA sequence 
5'-TGGCANNNTGCCA-3'. CTF/NF-I binding sites are present in viral and cellular 
promoters and in the origin of DNA replication of Adenovirus type 2. 
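
The palindromic character of the binding site can be checked directly: a DNA palindrome reads the same as its own reverse complement, so the two half-sites TGGCA and TGCCA are reverse complements of each other. The following is a minimal sketch; the function names and the example fragment are illustrative assumptions, and the scan only finds exact matches to the consensus on the strand given.

```python
import re

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (N stays N)."""
    return dna.translate(COMPLEMENT)[::-1]

SITE = "TGGCANNNTGCCA"  # consensus quoted in the text
SITE_REGEX = re.compile(r"TGGCA[ACGT]{3}TGCCA")  # N = any base

def is_palindromic(dna: str) -> bool:
    """A DNA palindrome equals its own reverse complement."""
    return dna == reverse_complement(dna)

def find_sites(dna: str):
    """0-based start positions of exact consensus matches on this strand."""
    return [m.start() for m in SITE_REGEX.finditer(dna)]

# Hypothetical fragment containing one consensus site.
fragment = "AAA" + "TGGCA" + "GTC" + "TGCCA" + "TT"
print(is_palindromic(SITE))
print(find_sites(fragment))
```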

The CTF/NF-I proteins were first identified as nuclear factor I, a collection
of proteins that activate the replication of several Adenovirus serotypes
(together with NF-II and NF-III) [3]. The family of proteins was also
identified as the CTF transcription factors, before the NFI and CTF families
were found to be identical [4]. The CTF/NF-I proteins are individually capable
of activating transcription and DNA replication. The CTF/NF-I family name has
also been written as NFI, NF-I or NF1.

In a given species, there are a large number of different CTF/NF-I proteins.
The multiplicity of CTF/NF-I is known to be generated both by alternative
splicing and by the occurrence of four different genes. The known forms of 
NF-I genes have been classified as: 

- The CTF-like factors subfamily (prototype form: CTF-1) [4] 

- The NFI-X proteins. 

- The NFI-A proteins. 

- The NFI-B proteins. 

So far, all CTF/NF-I family members appear to have similar transcription and 
replication activities. 

CTF/NF-I proteins contain 400 to 600 amino acids. The N-terminal 200 amino-acid
sequence, almost perfectly conserved in all species and genes sequenced,
mediates site-specific DNA recognition, protein dimerization and Adenovirus
DNA replication. The C-terminal 100 amino acids contain the transcriptional
activation domain. This activation domain is the target of gene expression
regulatory pathways elicited by growth factors and it interacts with basal
transcription factors and with histone H3 [6].

A perfectly conserved, highly charged 12 residue peptide located in the N-terminal part of 
CTF/NF-I has been selected as a specific signature for this family of proteins. 

-Consensus pattern: R-K-R-K-Y-F-K-K-H-E-K-R 

[ 1] Mermod N., O'Neill E.A., Kelly T.J., Tjian R.
Cell 58:741-753(1989).
[ 2] Rupp R.A.W., Kruse U., Multhaup G., Goebel U., Beyreuther K.,
Sippel A.E.
Nucleic Acids Res. 18:2607-2616(1990).
[ 3] Nagata K., Guggenheimer R.A., Enomoto T., Lichy J.H., Hurwitz J.
Proc. Natl. Acad. Sci. U.S.A. 79:6438-6442(1982).
[ 4] Santoro C., Mermod N., Andrews P.C., Tjian R.
Nature 334:218-224(1988).
[ 5] Gil G., Smith J.R., Goldstein J.L., Slaughter C.A., Orth K., Brown M.S.,
Osborne T.F.
Proc. Natl. Acad. Sci. U.S.A. 85:8963-8967(1988).
[ 6] Alevizopoulos A., Dusserre Y., Tsai-Pflugfelder M., von der Weid T.,
Wahli W., Mermod N.
Genes Dev. 9:3051-3066(1995).

90. Calsequestrin (Calsequestrin)
Number of members: 13 

Calsequestrin is a moderate-affinity, high-capacity calcium-binding protein
of cardiac and skeletal muscle [1], where it is located in the lumenal space
of the sarcoplasmic reticulum terminal cisternae. Calsequestrin acts as a
calcium buffer and plays an important role in the muscle excitation-contraction
coupling. It is a highly acidic protein of about 400 amino acid
residues that binds more than 40 moles of calcium per mole of protein. There
are at least two different forms of calsequestrin: one which is expressed in
cardiac muscles and another in skeletal muscles. Both forms have highly
similar sequences.

Two signature sequences have been developed. The first corresponds to the
N-terminus of the mature protein; the second is located just in front of the
C-terminus of the protein, which is composed of a highly acidic tail of
variable length.

-Consensus pattern: [EQ]-[DE]-G-L-[DN]-F-P-x-Y-D-G-x-D-R-V 

-Consensus pattern: [DE]-L-E-D-W-[LIVM]-E-D-V-L-x-G-x-[LIVM]-N-T-E-D-D-D 


[ 1] Treves S., Vilsen B., Chiozzi P., Andersen J.P., Zorzato F.
Biochem. J. 283:767-772(1992).

91. Carboxyl_trans (Carboxyl transferase domain)
[1] 

Medline: 93374821 

Primary structure of the monomer of the 12S subunit of 
transcarboxylase as deduced from DNA and characterization of the 
product expressed in Escherichia coli. 
Thornton CG, Kumar GK, Haase FC, Phillips NF, Woo SB, Park VM, 
Magner WJ, Shenoy BC, Wood HG, Samols D; 
J Bacteriol 1993;175:5301-5308.
[2] 

Medline: 93358891 

Molecular evolution of biotin-dependent carboxylases. 
Toh H, Kondo H, Tanabe T; 

Eur J Biochem 1993;215:687-696. 
All of the members in this family are biotin-dependent carboxylases.
The carboxyl transferase domain carries out the following reaction:
transcarboxylation from biotin to an acceptor molecule. There are
two recognised types of carboxyl transferase. One of them uses acyl-CoA
and the other uses 2-oxo acid as the acceptor molecule of carbon dioxide.
All of the members in this family utilise acyl-CoA as the acceptor
molecule.
Number of members: 47 

92. Chal_stil_synt (Chalcone and stilbene synthases) 
Number of members: 146 

Chalcone synthases (CHS) (EC 2.3.1.74) and stilbene synthases (STS) (formerly
known as resveratrol synthases) are related plant enzymes [1]. CHS is an
important enzyme in flavonoid biosynthesis and STS a key enzyme in
stilbene-type phytoalexin biosynthesis. Both enzymes catalyze the addition of three
molecules of malonyl-CoA to a starter CoA ester (a typical example is
4-coumaroyl-CoA), producing either a chalcone (with CHS) or stilbene (with
STS).

These enzymes are proteins of about 390 amino-acid residues. A conserved 
cysteine residue, located in the central section of these proteins, has been 
shown [2] to be essential for the catalytic activity of both enzymes and 
probably represents the binding site for the 4-coumaroyl-CoA group. The region
around this active site residue is well conserved and can be used as a 
signature pattern. 

In addition to the plant enzymes, this family also includes Bacillus subtilis 
bcsA. 

-Consensus pattern: R-[LIVMFYS]-x-[LIVM]-x-[QHG]-x-G-C-[FYNA]-[GA]-G-[GA]- 
[STAV]-x-[LIVMF]-[RA] [C is the active site residue] 

[ 1] Schroeder J., Schroeder G. 

Z. Naturforsch. 45C:1-8(1990). 
[ 2] Lanz T., Tropf S., Marner F.-J., Schroeder J., Schroeder G. 

J. Biol. Chem. 266:9971-9976(1991). 

93. Chorismate_synt (Chorismate synthase) 
Number of members: 19 

Chorismate synthase (EC 4.6.1.4) catalyzes the last of the seven steps in the 
shikimate pathway which is used in prokaryotes, fungi and plants for the 
biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination 
of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form 
chorismate which can then be used in phenylalanine, tyrosine or tryptophan 
biosynthesis. Chorismate synthase requires the presence of a reduced flavin 
mononucleotide (FMNH2 or FADH2) for its activity.

Chorismate synthase from various sources shows [1,2] a high degree of sequence
conservation. It is a protein of about 360 to 400 amino-acid residues.
Three signature patterns have been developed from conserved regions rich in basic
residues (mostly arginines). The first is in the N-terminal section, the
second is central and the third is C-terminal.

-Consensus pattern: G-E-S-H-[GC]-x(2)-[LIVM]-[GTV]-x-[LIVM](2)-[DE]-G-x-[PV] 

-Consensus pattern: [GE]-R-[SA](2)-[SAG]-R-[EV]-[ST]-x(2)-[RH]-V-x(2)-G

-Consensus pattern: R-[SH]-D-[PSV]-[CSAV]-x(4)-[GAI]-x-[IVGSP]-[LIVM]-x-E-[STAH]-
[LIVM]

[ 1] Schaller A., Schmid J., Leibinger U., Amrhein N. 
J. Biol. Chem. 266:21434-21438(1991). 
[ 2] Jones D.G.L., Reusser U., Braus G.H.
Mol. Microbiol. 5:2143-2152(1991). 

94. Clat_adaptor_s (Clathrin adaptor complex small chain) 
Number of members: 21

Clathrin coated vesicles (CCV) mediate intracellular membrane traffic such as
receptor mediated endocytosis. In addition to clathrin, the CCV are composed
of a number of other components including oligomeric complexes which are known
as adaptor or clathrin assembly proteins (AP) complexes [1]. The adaptor
complexes are believed to interact with the cytoplasmic tails of membrane
proteins, leading to their selection and concentration. In mammals two types of
adaptor complexes are known: AP-1 which is associated with the Golgi complex
and AP-2 which is associated with the plasma membrane. Both AP-1 and AP-2 are
heterotetramers that consist of two large chains - the adaptins - (gamma and
beta' in AP-1; alpha and beta in AP-2); a medium chain (AP47 in AP-1; AP50 in
AP-2) and a small chain (AP19 in AP-1; AP17 in AP-2).

The small chains of AP-1 and AP-2 are evolutionarily related proteins of about
18 Kd. Homologs of AP17 and AP19 have also been found in yeast (genes
APS1/YAP19 and APS2/YAP17) [2,3,4]. AP17 and AP19 are also related to the zeta-
chain [5] of coatomer (zeta-cop), a cytosolic protein complex that reversibly
associates with Golgi membranes to form vesicles that mediate biosynthetic
protein transport from the endoplasmic reticulum, via the Golgi up to the
trans Golgi network.

A conserved region in the central section of these proteins has been selected as a signature
pattern.

-Consensus pattern: [LIVM](2)-Y-[KR]-x(4)-L-Y-F 

[ 1] Pearse B.M., Robinson M.S.
Annu. Rev. Cell Biol. 6:151-171(1990).
[ 2] Kirchhausen T., Davis A.C., Frucht S., O'Brine Greco B., Payne G.S.,
Tubb B.
J. Biol. Chem. 266:11153-11157(1991).
[ 3] Nakai M., Takada T., Endo T.
Biochim. Biophys. Acta 1174:282-284(1993).
[ 4] Phan H.L., Finlay J.A., Chu D.S., Tan P.K., Kirchhausen T., Payne G.S.
EMBO J. 13:1706-1717(1994).
[ 5] Kuge O., Hara-Kuge S., Orci L., Ravazzola M., Amherdt M., Tanigawa G.,
Wieland F.T., Rothman J.E.
J. Cell Biol. 123:1727-1734(1993).

95. Clathrin_lg_ch (Clathrin light chain.) 
Number of members: 8 


Clathrin [1,2] is the major coat-forming protein that encloses vesicles such
as coated pits and forms cell surface patches involved in membrane traffic
within eukaryotic cells. The clathrin coats (called triskelions) are composed
of three heavy chains (180 Kd) and three light chains (23 to 27 Kd).

The clathrin light chains [3], which may help to properly orient the assembly
and disassembly of the clathrin coats, bind non-covalently to the heavy chain;
they also bind calcium and interact with the hsc70 uncoating ATPase.

- In higher eukaryotes two genes code for distinct but related light chains:
LC(a) and LC(b). Each of the two genes can yield, by tissue-specific
alternative splicing, two separate forms which differ by the insertion of a
sequence of respectively thirty or eighteen residues. In the N-terminal
part of the clathrin light chains there is a domain of twenty-one amino
acid residues which is perfectly conserved in LC(a) and LC(b).

- In yeast there is a single light chain (gene CLC1) whose sequence is only
distantly related to that of higher eukaryotes.


Two signature patterns have been developed for clathrin light chains. The first
pattern is a heptapeptide from the center of the conserved N-terminal region
of eukaryotic light chains; the second pattern is derived from a positively
charged region located in the C-terminal extremity of all known clathrin light
chains.

-Consensus pattern: F-L-A-Q-Q-E-S 

[ 1] Keen J.H.
Annu. Rev. Biochem. 59:415-438(1990).
[ 2] Brodsky F.M.
Science 242:1396-1402(1988).
[ 3] Brodsky F.M., Hill B.L., Acton S.L., Naethke I., Wong D.H.,
Ponnambalam S., Parham P.
Trends Biochem. Sci. 16:208-213(1991).

96. (Clathrin repeat) 7-fold repeat in Clathrin and VPS
Each repeat is about 140 amino acids long. The repeats
occur in the arm region of the Clathrin heavy chain. 
Number of members: 79 
[1] 

Medline: 92191269 

Folding and trimerization of clathrin subunits at the 
triskelion hub. 

Nathke IS, Heuser J, Lupas A, Stock J, Turck CW, Brodsky FM; 
Cell 1992;68:899-910.
[2]
Medline: 88097376

Clathrin heavy chain: molecular cloning and complete primary 
structure. 

Kirchhausen T, Harrison SC, Chow EP, Mattaliano RJ, 
Ramachandran KL, Smart J, Brosius J; 
Proc Natl Acad Sci U S A 1987;84:8805-8809. 

97. Collagen (Collagen triple helix repeat (20 copies)) 
[1] Medline: 94059583 
New members of the collagen superfamily 

Mayne R, Brewton RG; 
Curr Opin Cell Biol 1993;5:883-890. 

Scurvy is associated with collagens. 

Members of this family belong to the collagen superfamily [1]. 
Collagens are generally extracellular structural proteins 
involved in formation of connective tissue structure. 
The alignment contains 20 copies of the G-X-Y repeat that 
forms a triple helix. The first position of the repeat is 
glycine, the second and third positions can be any residue 
but are frequently proline and hydroxyproline. Collagens
are post-translationally modified by proline hydroxylase
to form the hydroxyproline residues. Defective
hydroxylation is the cause of scurvy.
Some members of the collagen superfamily are not involved
in connective tissue structure but share the same triple 
helical structure. 
Number of members: 2125 
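
The G-X-Y repeat described in this entry lends itself to a simple scan. The sketch below reports the longest run of consecutive triplets that begin with glycine. The helper names are hypothetical, and the default threshold of 20 copies is taken from the family title. It is a plain string heuristic under those assumptions and is not the profile-based method used to build the family.

```python
import re

# Lookahead so that a run starting at every glycine is considered,
# including runs that overlap an earlier, shorter run.
GXY_RUN = re.compile(r"(?=((?:G[A-Z]{2})+))")

def longest_gxy_run(sequence: str) -> int:
    """Number of triplets in the longest run of consecutive G-X-Y repeats."""
    best = 0
    for m in GXY_RUN.finditer(sequence):
        best = max(best, len(m.group(1)) // 3)
    return best

def looks_like_collagen_repeat(sequence: str, copies: int = 20) -> bool:
    """True if the sequence contains at least `copies` consecutive G-X-Y triplets."""
    return longest_gxy_run(sequence) >= copies

# Artificial sequences used only for illustration.
print(longest_gxy_run("GPP" * 20))
print(looks_like_collagen_repeat("MKGPPGAPLV"))
```

Because the scan restarts at every glycine, it costs quadratic time in the worst case, which is acceptable for single protein sequences.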


98. Coprogen_oxidas (Coproporphyrinogen III oxidase) 
Number of members: 12 

Coproporphyrinogen III oxidase (EC 1.3.3.3) (coproporphyrinogenase) [1,2]
catalyzes the oxidative decarboxylation of coproporphyrinogen III into
protoporphyrinogen IX, a common step in the pathway for the biosynthesis of
porphyrins such as heme, chlorophyll or cobalamin.

Coproporphyrinogen III oxidase is an enzyme that requires iron for its
activity. A cysteine seems to be important for the catalytic mechanism [3].
Sequences from a variety of eukaryotic and prokaryotic sources show that
this enzyme has been evolutionarily conserved. A highly conserved region in
the central part of the sequence has been selected as a signature pattern. This
region contains the only conserved cysteine and is rich in charged amino
acids.

-Consensus pattern: K-x-W-C-x(2)-[FYH](3)-[LIVM]-x-H-R-x-E-x-R-G-[LIVM]-G-G-
[LIVM]-F-F-D

[ 1] Xu K., Elliott T.
J. Bacteriol. 175:4990-4999(1993).
[ 2] Kohno H., Furukawa T., Yoshinaga T., Tokunaga R., Taketani S.
J. Biol. Chem. 268:21359-21363(1993).
[ 3] Camadro J.M., Chambon H., Jolles J., Labbe P.
Eur. J. Biochem. 156:579-587(1986).
[ 4] Xu K., Elliott T.
J. Bacteriol. 176:3196-3203(1994).

99. Corona_nucleoca (Coronavirus nucleocapsid protein)
[1] 

Medline: 98087828 
Identification of a specific interaction between the
coronavirus mouse hepatitis virus A59 nucleocapsid protein
and packaging signal.
Molenkamp R, Spaan WJ;
Virology 1997;239:78-86.
Number of members: 44

100. Cu-oxidase (Multicopper oxidase) 
[1] 

Medline: 90126844

The blue oxidases, ascorbate oxidase, laccase and ceruloplasmin. 
Modelling and structural relationships. 
Messerschmidt A, Huber R; 
Eur J Biochem 1990;187:341-352. 
Number of members: 150

Multicopper oxidases [1,2] are enzymes that possess three spectroscopically
different copper centers. These centers are called: type 1 (or blue), type 2
(or normal) and type 3 (or coupled binuclear). The enzymes that belong to
this family are:

- Laccase (EC 1.10.3.2) (urushiol oxidase), an enzyme found in fungi and
plants, which oxidizes many different types of phenols and diamines.

- Ascorbate oxidase (EC 1.10.3.3), a higher plant enzyme.

- Ceruloplasmin (EC 1.16.3.1) (ferroxidase), a protein found in the serum of
mammals and birds, which oxidizes a great variety of inorganic and organic
substances. Structurally ceruloplasmin exhibits internal sequence homology,
and seems to have evolved from the triplication of a copper-binding domain
similar to that found in laccase and ascorbate oxidase.

In addition to the above enzymes there are a number of proteins which, on the
basis of sequence similarities, can be said to belong to this family. These
proteins are:

- Copper resistance protein A (copA) from a plasmid in Pseudomonas syringae.
This protein seems to be involved in the resistance of the microbial host
to copper.

- Blood coagulation factor V (Fa V).

- Blood coagulation factor VIII (Fa VIII) [E1].

- Yeast FET3 [3], which is required for ferrous iron uptake.

- Yeast hypothetical protein YFL041w and SpAC1F7.08, the fission yeast
homolog.


Factors V and VIII act as cofactors in blood coagulation and are structurally 
similar [4]. Their sequence consists of a triplicated A domain, a B domain and 
a duplicated C domain; in the following order: A-A-B-A-C-C. The A-type domain 
is related to the multicopper oxidases. 


Two signature patterns have been developed for these proteins. Both patterns are
derived from the same region, which in ascorbate oxidase, laccase, in the
third domain of ceruloplasmin, and in copA, contains five residues that are
known to be involved in the binding of copper centers. The first pattern does
not make any assumption on the presence of copper-binding residues and thus
can detect domains that have lost the ability to bind copper (such as those in
Fa V and Fa VIII), while the second pattern is specific to copper-binding
domains.

-Consensus pattern: G-x-[FYW]-x-[LIVMFYW]-x-[CST]-x(8)-G-[LM]-x(3)-[LIVMFYW]

-Consensus pattern: H-C-H-x(3)-H-x(3)-[AG]-[LM]
[The first two H's are copper type 3 binding residues]
[The C, the 3rd H, and L or M are copper type 1 ligands]
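
Since the first pattern is indifferent to the copper ligands and the second requires them, the two can be combined to separate copper-binding domains from related domains that may have lost that ability. The sketch below is one way to express that logic. The function name, the result labels and both test peptides are hypothetical and were constructed only to exercise the patterns.

```python
import re

# First pattern: makes no assumption about copper ligands.
GENERAL = re.compile(r"G.[FYW].[LIVMFYW].[CST].{8}G[LM].{3}[LIVMFYW]")
# Second pattern: requires the histidine and cysteine copper ligands.
COPPER = re.compile(r"HCH.{3}H.{3}[AG][LM]")

def classify_domain(sequence: str) -> str:
    """Label a sequence according to which of the two signatures it carries."""
    general = GENERAL.search(sequence) is not None
    copper = COPPER.search(sequence) is not None
    if copper:
        return "copper-binding signature present"
    if general:
        return "oxidase-like region without copper-ligand signature"
    return "no signature found"

# Artificial peptides, not real protein sequences.
with_ligands = "GAWAL" + "HCH" + "AAA" + "H" + "AAA" + "GM" + "AAAL"
without_ligands = "GAWALAS" + "A" * 8 + "GLAAAL"

print(classify_domain(with_ligands))
print(classify_domain(without_ligands))
```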

101. Cullin (Cullin family)
Number of members: 24 


The following proteins are collectively termed cullins [1]: 

- Caenorhabditis elegans cul-1 (or lin-19), a protein required for
developmentally programmed transitions from the G1 phase of the cell cycle
to the G0 phase or the apoptotic pathway.

- Caenorhabditis elegans cul-2, cul-3, cul-4 (F45E12.3), cul-5 (ZK856.1) and
cul-6 (K08E7.7).

- Mammalian CUL1, CUL2, CUL3, CUL4A and CUL4B.

- Mammalian vasopressin-activated calcium-mobilizing receptor (VACM-1), a
kidney-specific protein thought to form a cell surface receptor [2] but
which does not have any structural hallmarks of a receptor.

- Drosophila lin19.

- Yeast CDC53 [3], which acts in concert with CDC4 and UBC3 (CDC34) to
control the G1-to-S phase transition.

- Yeast hypothetical protein YGR003w.

- Fission yeast hypothetical protein SpAC24H6.03.

The cullins are hydrophilic proteins of 740 to 815 amino acids. The C-terminal
extremity is the most conserved part of these proteins. A
signature pattern has been developed from that region.

-Consensus pattern: [LIV]-K-x(2)-[LIV]-x(2)-L-I-[DEQ]-[KRHNQ]-x-Y-[LIVM]-x-R-
x(6,7)-[FY]-x-Y-x-[SA]>
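
In PROSITE notation a trailing '>' anchors the pattern to the C-terminal end of the sequence, which fits the statement that the C-terminal extremity is the most conserved part of the cullins. In a regular expression this corresponds to an end-of-string anchor. The sketch below is a hand translation; the helper name and the artificial tail peptide are assumptions made for illustration.

```python
import re

# [LIV]-K-x(2)-[LIV]-x(2)-L-I-[DEQ]-[KRHNQ]-x-Y-[LIVM]-x-R-x(6,7)-[FY]-x-Y-x-[SA]>
# L-I is read as leucine followed by isoleucine; '>' becomes the '$' anchor.
CULLIN_CTERM = re.compile(
    r"[LIV]K.{2}[LIV].{2}LI[DEQ][KRHNQ].Y[LIVM].R.{6,7}[FY].Y.[SA]$"
)

def has_cullin_signature(sequence: str) -> bool:
    """True only if the signature ends exactly at the C-terminus."""
    return CULLIN_CTERM.search(sequence.strip()) is not None

# Artificial tail built only to satisfy the pattern (not a real cullin).
tail = "LKAAIAALIDKAYLAR" + "A" * 6 + "FAYAS"

print(has_cullin_signature("MSTT" + tail))   # signature at the very end
print(has_cullin_signature(tail + "GG"))     # two extra residues after it
```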

[ 1] Kipreos E.T., Lander L.E., Wing J.P., He W.W., Hedgecock E.M.
Cell 85:829-839(1996).
[ 2] Burnatowska-Hledin M.A., Spielman W.S., Smith W.L., Shi P., Meyer J.M.,
Dewitt D.L.
Am. J. Physiol. 268:F1198-F1210(1995).
[ 3] Mathias N., Johnson S.L., Winey M., Adams A.E., Goetsch L., Pringle J.R.,
Byers B., Goebl M.G.
Mol. Cell. Biol. 16:6634-6643(1996).

102. (Cu_amine_oxid) 

Copper amine oxidase signatures 

Amine oxidases (AO) [1] are enzymes that catalyze the oxidation of a wide range of biogenic 
amines including many neurotransmitters, histamine and xenobiotic amines. There are two 
classes of amine oxidases: flavin-containing (EC 1.4.3.4) and copper-containing (EC 1.4.3.6). 

Copper-containing AO is found in bacteria, fungi, plants and animals, it is an homodimeric 
enzyme that binds one copper ion per subunit as well as a 2,4,5- trihydroxyphenylalanine 
quinone (or topaquinone) (TPQ) cofactor. This cofactor is derived from a tyrosine residue. 

Two signature patterns were derived for copper AO, the first one contains the tyrosine which 
give rises to the TPQ cofactor while the second one contains one of the three histidines 
that bind the copper atom [2]. 

Consensus pattern: [LIVM]-[LIVMA]-[LIVMF]-x(4)-[ST]-x(2)-N-Y-[DE]-[YN] [The first Y
gives rise to TPQ]
Sequences known to belong to this class detected by the pattern: ALL.

Consensus pattern: T-x-[GS]-x(2)-H-[LIVMF]-x(3)-E-[DE]-x-P [H is a copper ligand]
Sequences known to belong to this class detected by the pattern: ALL, except for lentil AO.
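
The consensus patterns in this listing use the PROSITE notation: elements are separated by hyphens, x stands for any residue, square brackets list the residues allowed at a position, a number or range in parentheses gives a repeat count, and a trailing > ties the pattern to the C-terminal end. As a minimal sketch (not part of the original disclosure), such a pattern can be translated into a regular expression and searched against a sequence. The function name prosite_to_regex and the demonstration sequence are assumptions made purely for illustration.

```python
import re

# One pattern element: a residue specification plus an optional repeat count.
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex string."""
    body = pattern.strip()
    prefix = "^" if body.startswith("<") else ""   # '<' anchors at the N-terminus
    suffix = "$" if body.endswith(">") else ""     # '>' anchors at the C-terminus
    body = body.strip("<>")

    parts = []
    for element in body.split("-"):
        match = _ELEMENT.fullmatch(element.strip())
        if match is None:
            raise ValueError(f"unrecognised pattern element: {element!r}")
        residue, low, high = match.groups()
        if residue == "x":
            residue = "."                           # any amino acid
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"    # any amino acid except these
        if low is not None:
            residue += "{" + low + ("," + high if high else "") + "}"
        parts.append(residue)
    return prefix + "".join(parts) + suffix


# Second copper amine oxidase signature from the entry above.
COPPER_LIGAND = "T-x-[GS]-x(2)-H-[LIVMF]-x(3)-E-[DE]-x-P"
# Synthetic sequence built to contain the motif; it is not a real amine oxidase.
DEMO_SEQUENCE = "AAATAGAAHLAAAEDAPAAA"

hit = re.search(prosite_to_regex(COPPER_LIGAND), DEMO_SEQUENCE)
print(prosite_to_regex(COPPER_LIGAND), hit.span() if hit else None)
```

The translation is purely syntactic. A garbled pattern element raises a ValueError rather than being silently skipped.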

[ 1] Knowles P.F., Dooley D.M. (In) Metal ions in biological systems; Sigel H., Sigel A.,
Eds., 30:361-403, Marcel Dekker, New York, (1993).

[ 2] Parsons M.R., Convery M.A., Wilmot C.M., Yadav K.D.S., Blakeley V., Corner A.S.,
Phillips S.E.V., McPherson M.J., Knowles P.F. Structure 3:1171-1184(1995).



103. Cys-protease (Cysteine protease)
Number of members: 358

Eukaryotic thiol proteases (EC 3.4.22.-) [1] are a family of proteolytic 
enzymes which contain an active site cysteine. Catalysis proceeds through a 
thioester intermediate and is facilitated by a nearby histidine side chain; an 
asparagine completes the essential catalytic triad. The proteases which are 
currently known to belong to this family are listed below (references are 
only provided for recently determined sequences). 

- Vertebrate lysosomal cathepsins B (EC 3.4.22.1), H (EC 3.4.22.16), L
(EC 3.4.22.15), and S (EC 3.4.22.27) [2].

- Vertebrate lysosomal dipeptidyl peptidase I (EC 3.4.14.1) (also known as 
cathepsin C) [2]. 

- Vertebrate calpains (EC 3.4.22.17). Calpains are intracellular
calcium-activated thiol proteases that contain both an N-terminal catalytic
domain and a C-terminal calcium-binding domain.

- Mammalian cathepsin K, which seems involved in osteoclastic bone resorption 
[3]. 

- Human cathepsin O [4]. 

- Bleomycin hydrolase. An enzyme that catalyzes the inactivation of the 
antitumor drug BLM (a glycopeptide). 

- Plant enzymes: barley aleurain (EC 3.4.22.16), EP-B1/B4; kidney bean EP-C1,
rice bean SH-EP; kiwi fruit actinidin (EC 3.4.22.14); papaya latex papain
(EC 3.4.22.2), chymopapain (EC 3.4.22.6), caricain (EC 3.4.22.30), and
proteinase IV (EC 3.4.22.25); pea turgor-responsive protein 15A; pineapple
stem bromelain (EC 3.4.22.32); rape COT44; rice oryzain alpha, beta, and
gamma; tomato low-temperature induced; Arabidopsis thaliana A494, RD19A and
RD21A.

- House-dust mites allergens DerP1 and EurM1.

- Cathepsin B-like proteinases from the worms Caenorhabditis elegans (genes
gcp-1, cpr-3, cpr-4, cpr-5 and cpr-6), Schistosoma mansoni (antigen SM31)
and Japonica (antigen SJ31), Haemonchus contortus (genes AC-1 and AC-2),
and Ostertagia ostertagi (CP-1 and CP-3).

- Slime mold cysteine proteinases CP1 and CP2.

- Cruzipain from Trypanosoma cruzi and brucei. 

- Trophozoite cysteine proteinase (TCP) from various Plasmodium species.

- Proteases from Leishmania mexicana, Theileria annulata and Theileria parva. 

- Baculoviruses cathepsin-like enzyme (v-cath). 

- Drosophila small optic lobes protein (gene sol), a neuronal protein that 
contains a calpain-like domain. 

- Yeast thiol protease BLH1/YCP1/LAP3.

- Caenorhabditis elegans hypothetical protein C06G4.2, a calpain-like 
protein. 

Two bacterial peptidases are also part of this family: 

- Aminopeptidase C from Lactococcus lactis (gene pepC) [5]. 

- Thiol protease tpr from Porphyromonas gingivalis. 

Three other proteins are structurally related to this family, but may have 
lost their proteolytic activity. 

- Soybean oil body protein P34. This protein has its active site cysteine 
replaced by a glycine. 

- Rat testin, a Sertoli cell secretory protein highly similar to cathepsin L
but with the active site cysteine replaced by a serine. Rat testin
should not be confused with mouse testin which is a LIM-domain protein (see
<PDOC00382>).

- Plasmodium falciparum serine-repeat protein (SERA), the major blood stage 
antigen. This protein of 111 Kd possesses a C-terminal thiol-protease-like 
domain [6], but the active site cysteine is replaced by a serine. 

The sequences around the three active site residues are well conserved and can
be used as signature patterns.

-Consensus pattern: Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC]-[STAGCV] [C is the active site
residue]

-Consensus pattern: [LIVMGSTAN]-x-H-[GSACE]-[LIVM]-x-[LIVMAT](2)-G-x-
[GSADNH] [H is the active site residue]

-Consensus pattern: [FYCH]-[WI]-[LIVT]-x-[KRQAG]-N-[ST]-W-x(3)-[FYW]-G-x(2)-G-
[LFYW]-[LIVMFYG]-x-[LIVMF] [N is the active site residue]
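
Each of the three signatures above marks one residue of the catalytic triad. A hedged sketch of how the annotated residue could be located: the patterns are rewritten by hand as regular expressions, and the active-site position is wrapped in a capture group so that its index can be read from the match. The dictionary and function names, and the demonstration sequence (three motif-bearing fragments joined by proline spacers), are hypothetical and serve only to illustrate the idea.

```python
import re

# Hand-written regex equivalents of the three signatures; the annotated
# active-site residue is the single capture group in each expression.
TRIAD_SIGNATURES = {
    "Cys": r"Q.{3}[GE].(C)[YW].{2}[STAGC][STAGCV]",
    "His": r"[LIVMGSTAN].(H)[GSACE][LIVM].[LIVMAT]{2}G.[GSADNH]",
    "Asn": r"[FYCH][WI][LIVT].[KRQAG](N)[ST]W.{3}[FYW]G.{2}G[LFYW][LIVMFYG].[LIVMF]",
}


def active_site_positions(sequence):
    """Return {residue name: 0-based index} for every signature that matches."""
    positions = {}
    for name, regex in TRIAD_SIGNATURES.items():
        match = re.search(regex, sequence)
        if match:
            positions[name] = match.start(1)  # index of the captured residue
    return positions


# Synthetic sequence (an assumption for illustration, not a real protease).
DEMO = ("PPPP" + "QGSCGSCWAFSA" + "PPPPP" + "LDHAVAAVGYG"
        + "PPP" + "YWIVKNSWGTGWGENGYIRI" + "PP")
print(active_site_positions(DEMO))
```

A sequence in which the active-site cysteine is replaced, as in the three related proteins listed above, would not match the first expression and so would return no "Cys" entry.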

[ 1] Dufour E. Biochimie 70:1335-1342(1988).
[ 2] Kirschke H., Barrett A.J., Rawlings N.D. Protein Prof. 2:1587-1643(1995).

[ 3] Shi G.-P., Chapman H.A., Bhairi S.M., Deleeuw C., Reddy V.Y., Weiss S.J. FEBS Lett.
357:129-134(1995).

[ 4] Velasco G., Ferrando A.A., Puente X.S., Sanchez L.M., Lopez-Otin C. J. Biol. Chem.
269:27136-27142(1994).
[ 5] Chapot-Chartier M.P., Nardi M., Chopin M.C., Chopin A., Gripon J.C. Appl. Environ.
Microbiol. 59:330-333(1993).

[ 6] Higgins D.G., McConnell D.J., Sharp P.M. Nature 340:604-604(1989).
[ 7] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:461-486(1994).

104. Cys_Met_Meta_PP (Cys/Met metabolism PLP -dependent enzyme) 
[1] Medline: 96428687
Crystal structure of the pyridoxal-5'-phosphate dependent
cystathionine beta-lyase from Escherichia coli at 1.83 A.
Clausen T, Huber R, Laber B, Pohlenz HD, Messerschmidt A;
J Mol Biol 1996;262:202-224.
[2] Medline: 99059720
Crystal structure of Escherichia coli cystathionine
gamma-synthase at 1.5 A resolution.
Clausen T, Huber R, Prade L, Wahl MC, Messerschmidt A;
EMBO J 1998;17:6827-6838.
Database Reference: SCOP; 1cs1; fa; [SCOP-USA] [CATH-PDBSUM]
This family includes enzymes involved in cysteine and
methionine metabolism. The following are members:

Cystathionine gamma-lyase,

Cystathionine gamma-synthase,

Cystathionine beta-lyase,

Methionine gamma-lyase,

OAH/OAS sulfhydrylase,

O-succinylhomoserine sulfhydrylase.

All of these members participate in slightly different reactions.
All these enzymes use PLP (pyridoxal-5'-phosphate) as a cofactor.
Number of members: 52 

A number of pyridoxal-dependent enzymes involved in the metabolism of
cysteine, homocysteine and methionine have been shown [1,2] to be evolutionarily
related. These are:

- Cystathionine gamma-lyase (EC 4.4.1.1) (gamma-cystathionase), which
catalyzes the transformation of cystathionine into cysteine, oxobutanoate
and ammonia. This is the final reaction in the transsulfuration pathway that
leads from methionine to cysteine in eukaryotes.

- Cystathionine gamma-synthase (EC 4.2.99.9), which catalyzes the conversion 
of cysteine and succinyl-homoserine into cystathionine and succinate: the 
first step in the biosynthesis of methionine from cysteine in bacteria 
(gene metB). 

- Cystathionine beta-lyase (EC 4.4.1.8) (beta-cystathionase), which catalyzes 
the conversion of cystathionine into homocysteine, pyruvate and ammonia: 
the second step in the biosynthesis of methionine from cysteine in bacteria 
(gene metC). 

- Methionine gamma-lyase (EC 4.4.1.11) (L-methioninase) which catalyzes the 
transformation of methionine into methanethiol, oxobutanoate and ammonia. 

- OAH/OAS sulfhydrylase, which catalyzes the conversion of acetylhomoserine 
into homocysteine and that of acetylserine into cysteine (gene MET17 or 
MET25 in yeast). 

- O-succinylhomoserine sulfhydrylase (EC 4.2.99.-).

- Yeast hypothetical protein YGL184c.

- Yeast hypothetical protein YHR112c. 

These enzymes are proteins of about 400 amino-acid residues. The pyridoxal-P
group is attached to a lysine residue located in the central section of these
enzymes; the sequence around this residue is highly conserved and can be used
as a signature pattern to detect this class of enzymes.

-Consensus pattern: [DQ]-[LIVMF]-x(3)-[STAGC]-[STAGCI]-T-K-[FYWQ]-[LIVMF]-x-G-
[HQ]-[SGNH] [K is the pyridoxal-P attachment site]

[ 1] Ono B.I., Tanaka K., Naito K., Heike C., Shinoda S., Yamamoto S., 
Ohmori S., Oshima T., Toh-E A. 
J. Bacteriol. 174:3339-3347(1992). 
[ 2] Barton A.B., Kaback D.B., Clark M.W., Keng T., Ouellette B.F.F.,
Storms R.K., Zeng B., Zhong W.W., Fortin N., Delaney S., Bussey H. 
Yeast 9:363-369(1993). 

105. Cyt_reductase

FAD/NAD-binding Cytochrome reductase 
Number of members: 60 
[1] Medline: 95111952
Crystal structure of the FAD-containing fragment of corn
nitrate reductase at 2.5 A resolution: relationship to other
flavoprotein reductases.
Lu G, Campbell WH, Schneider G, Lindqvist Y;
Structure 1994;2:809-821.
[2] Medline: 92084635
The sequence of squash NADH:nitrate reductase and its
relationship to the sequences of other flavoprotein
oxidoreductases. A family of flavoprotein pyridine
nucleotide cytochrome reductases.
Hyde GE, Crawford NM, Campbell WH;
J Biol Chem 1991;266:23542-23547.

106. Cytidylyltrans 
Phosphatidate cytidylyltransferase 
Number of members: 21 

Phosphatidate cytidylyltransferase (EC 2.7.7.41) [1,2,3] (also known as CDP- 
diacylglycerol synthase) (CDS) is the enzyme that catalyzes the synthesis of 
CDP-diacylglycerol from CTP and phosphatidate (PA). CDP-diacylglycerol is an 
important branch point intermediate in both prokaryotic and eukaryotic 
organisms. CDS is a membrane-bound enzyme. A conserved region located in the 
C-terminal part has been selected as a signature pattern. 

-Consensus pattern: S-x-[LIVMF]-K-R-x(4)-K-D-x-[GSA]-x(2)-[LI]-[PG]-x-H-G-G- 
[LIVM]-x-D-R-[LIVMF]-D 

[ 1] Sparrow C.P., Raetz C.R.H.
J. Biol. Chem. 260:12084-12091(1985).
[ 2] Shen H., Heacock P.N., Clancey C.J., Dowhan W.
J. Biol. Chem. 271:789-795(1996).
[ 3] Saito S., Goto K., Tonosaki A., Kondo H.
J. Biol. Chem. 272:9503-9509(1997).

107. (Cytidylyltransf) Cytidylyltransferase. This family includes: Cholinephosphate
cytidylyltransferase. Glycerol-3-phosphate cytidylyltransferase.
Number of members: 64 

[1] Medline: 10208837 CTP:Phosphocholine Cytidylyltransferase: Insights into Regulatory 
Mechanisms and Novel Functions. Clement JM, Kent C; Biochem Biophys Res Commun 
1999;257:643-650.

108. (cNMP binding) Cyclic nucleotide-binding domain signatures and profile

Proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120
residues [1-3]. The best studied of these proteins is the prokaryotic catabolite gene activator
(also known as the cAMP receptor protein) (gene crp) where such a domain is known to be
composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel
structure. Such a domain is known to exist in the following proteins:

- Prokaryotic catabolite gene activator protein (CAP).

- cAMP- and cGMP-dependent protein kinases (cAPK and cGPK). Both types of kinases contain
two tandem copies of the cyclic nucleotide-binding domain. The cAPK's are composed of two
different subunits: a catalytic chain and a regulatory chain which contains both copies of
the domain. The cGPK's are single chain enzymes that include the two copies of the domain in
their N-terminal section. The nucleotide specificity of cAPK and cGPK is due to an amino
acid in the conserved region of beta-barrel 7: a threonine that is invariant in cGPK is an
alanine in most cAPK.

- Vertebrate cyclic nucleotide-gated ion-channels. Two such cation channels have been fully
characterized. One is found in rod cells where it plays a role in visual signal transduction. It
specifically binds to cGMP leading to an opening of the channel and thereby causing a
depolarization of rod photoreceptors. In olfactory epithelium a similar, cAMP-binding,
channel plays a role in odorant signal transduction.

There are six invariant amino acids in this domain, three of which are glycine residues that
are thought to be essential for maintenance of the structural integrity of the beta-barrel.
Two signature patterns for this domain have been developed. The first pattern is located
within beta-barrels 2 and 3 and contains the first two conserved Gly. The second pattern is
located within beta-barrels 6 and 7 and contains the third conserved Gly as well as the three
other invariant residues.

First consensus pattern: [LIVM]-[VIC]-x(2)-G-[DENQTA]-x-[GAC]-x(2)-[LIVMFY](4)-
x(2)-G

Second consensus pattern: [LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-
[LIVMA]-x-[STACV]



[ 1] Weber I.T., Shabb J.B., Corbin J.D. Biochemistry 28:6122-6127(1989). 

[ 2] Kaupp U.B. Trends Neurosci. 14:150-157(1991). 

[ 3] Shabb J.B., Corbin J.D. J. Biol. Chem. 267:5723-5726(1992).

109. (cadherin)

Cadherins extracellular repeated domain signature
Cadherins [1,2] are a family of animal glycoproteins responsible for calcium-dependent
cell-cell adhesion. Cadherins preferentially interact with themselves in a homophilic manner in
connecting cells, thus acting as both receptor and ligand. A wide number of tissue-specific
forms of cadherins are known:

- Epithelial (E-cadherin) (also known as uvomorulin or L-CAM) (CDH1).

- Neural (N-cadherin) (CDH2).

- Placental (P-cadherin) (CDH3).

- Retinal (R-cadherin) (CDH4).

- Vascular endothelial (VE-cadherin) (CDH5).

- Kidney (K-cadherin) (CDH6).

- Cadherin-8 (CDH8).

- Osteoblast (OB-cadherin) (CDH11).

- Brain (BR-cadherin) (CDH12).

- T-cadherin (truncated cadherin) (CDH13).

- Muscle (M-cadherin) (CDH14).

- Liver-intestine (LI-cadherin).

- EP-cadherin. 

Structurally, cadherins are built of the following domains: a signal sequence, followed by a
propeptide of about 130 residues, then an extracellular domain of around 600 residues, then a
transmembrane region, and finally a C-terminal cytoplasmic domain of about 150 residues.
The extracellular domain can be subdivided into five parts: there are four repeats of about
110 residues followed by a region that contains four conserved cysteines. It is suggested that
the calcium-binding region of cadherins is located in the extracellular repeats.

Cadherins are evolutionarily related to the desmogleins, which are components of intercellular
desmosome junctions involved in the interaction of plaque proteins:

- Desmoglein 1 (desmosomal glycoprotein I).

- Desmoglein 2. 

- Desmoglein 3 (Pemphigus vulgaris antigen). 

The Drosophila fat protein [3] is a huge protein of over 5000 amino acids that contains 34
cadherin-like repeats in its extracellular domain.

The signature pattern that was developed for the repeated domain is located in the
C-terminal extremity, which is its best conserved region. The pattern includes two conserved
aspartic acid residues as well as two asparagines; these residues could be implicated in the
binding of calcium.

Consensus pattern: [LIV]-x-[LIV]-x-D-x-N-D-[NH]-x-P
Sequences known to belong to this class detected by the pattern: ALL.
Note: this pattern is found in the first, second, and fourth copies of the repeated domain.
In the third copy there is a deletion of one residue after the second conserved Asp.
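
Because the signature can occur several times in one extracellular domain, it may be useful to list every occurrence rather than only the first. The following is a small illustrative sketch, assuming a hand-translated regular expression for the pattern above. The fragment strings are synthetic and hypothetical: one carries the full motif, the other lacks one residue after the second Asp to mimic the deletion described in the note.

```python
import re

# Hand-translated regex for [LIV]-x-[LIV]-x-D-x-N-D-[NH]-x-P.
CADHERIN_REPEAT = re.compile(r"[LIV].[LIV].D.ND[NH].P")


def repeat_signature_starts(sequence):
    """Return the 0-based start index of every non-overlapping match."""
    return [m.start() for m in CADHERIN_REPEAT.finditer(sequence)]


# Synthetic fragments (assumptions for illustration, not real cadherin sequence).
INTACT = "VTVTDVNDNAP"            # carries the full signature
ONE_RESIDUE_SHORT = "VTVTDVNDAP"  # one residue missing after the second Asp

# Four copies separated by glycine spacers; the third copy has the deletion.
DEMO = "GGGG".join([INTACT, INTACT, ONE_RESIDUE_SHORT, INTACT])
print(repeat_signature_starts(DEMO))
```

In this toy sequence the copy with the deletion is skipped, so three start positions are reported for four copies.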

[ 1] Takeichi M. Annu. Rev. Biochem. 59:237-252(1990). 
[ 2] Takeichi M. Trends Genet. 3:213-217(1987). 
[ 3] Mahoney P.A., Weber U., Onofrechuk P., Biessmann H., Bryant P.J., Goodman C.S. Cell
67:853-868(1991).

110. Calreticulin family signatures 

Calreticulin [1] (also known as calregulin, CRP55 or HACBP) is a high-capacity
calcium-binding protein which is present in most tissues and located at the periphery of the
endoplasmic (ER) and the sarcoplasmic reticulum (SR) membranes. It probably plays a role in
the storage of calcium in the lumen of the ER and SR and it may well have other important
functions. Structurally, calreticulin is a protein of about 400 amino acid residues consisting of
three domains:

a) An N-terminal, probably globular, domain of about 180 amino acid residues (N-domain);

b) A central domain of about 70 residues (P-domain) which contains three repeats of an
acidic 17 amino acid motif. This region binds calcium with a low capacity, but a high
affinity;

c) A C-terminal domain rich in acidic residues and in lysine (C-domain). This region binds
calcium with a high capacity but a low affinity.

Calreticulin is evolutionarily related to the following proteins:

- Onchocerca volvulus antigen RAL-1. RAL-1 is highly similar to calreticulin, but possesses a
C-terminal domain rich in lysine and arginine and lacks acidic residues and is therefore not
expected to bind calcium in that region.

- Calnexin [2]. A calcium-binding protein that interacts with newly synthesized glycoproteins
in the endoplasmic reticulum. It seems to play a major role in the quality control apparatus of
the ER by the retention of incorrectly folded proteins.

- Calmegin [3] (or calnexin-T), a testis-specific calcium-binding protein highly similar to
calnexin.

Three signature patterns have been developed for this family of proteins. The first two
patterns are based on conserved regions in the N-domain; the third pattern corresponds to
positions 4 to 16 of the repeated motif in the P-domain.

Consensus pattern: [KRHN]-x-[DEQN]-[DEQNK]-x(3)-C-G-G-[AG]-[FY]-[LIVM]-[KN]-
[LIVMFY](2)

Consensus pattern: [LIVM](2)-F-G-P-D-x-C-[AG]

Consensus pattern: [IV]-x-D-x-[DENST]-x(2)-K-P-[DEH]-D-W-[DEN]

[ 1] Michalak M., Milner R.E., Burns K., Opas M. Biochem. J. 285:681-692(1992). 

[ 2] Bergeron J.J.M., Brenner M.B., Thomas D.Y., Williams D.B. Trends Biochem. Sci.
19:124-128(1994).

[ 3] Watanabe D., Yamada K., Nishina Y., Tajima Y., Koshimizu U., Nagata A., Nishimune 
Y. J. Biol. Chem. 269:7744-7749(1994). 

111. Eukaryotic-type carbonic anhydrases signature (carb_anhydrase)

Carbonic anhydrases (EC 4.2.1.1) (CA) [1,2,3,4] are zinc metalloenzymes which catalyze the
reversible hydration of carbon dioxide. Eight enzymatic and evolutionarily related forms of
carbonic anhydrase are currently known to exist in vertebrates: three cytosolic isozymes
(CA-I, CA-II and CA-III); two membrane-bound forms (CA-IV and CA-VII); a mitochondrial
form (CA-V); a secreted salivary form (CA-VI); and a yet uncharacterized isozyme [5]. In the
alga Chlamydomonas reinhardtii, two CA isozymes have been sequenced [6]. They are
periplasmic glycoproteins evolutionarily related to vertebrate CAs. Some bacteria, such as
Neisseria gonorrhoeae [7], also have a eukaryotic-type CA.

CAs contain a single zinc atom bound to three conserved histidine residues. As a signature
for CAs, a pattern has been developed which includes one of these zinc-binding histidines.
Protein D8 from Vaccinia and other poxviruses is related to CAs but has lost two of the
zinc-binding histidines as well as many otherwise conserved residues. This is also true of the
N-terminal extracellular domain of some receptor-type tyrosine-protein phosphatases (see
<PDOC00323>).

Consensus pattern: S-E-[HN]-x-[LIVM]-x(4)-[FYH]-x(2)-E-[LIVMGA]-H-[LIVMFA](2)
[The second H is a zinc ligand]

Note: most prokaryotic CA's as well as plant chloroplast CA's belong to another, evolutionarily
distinct family of proteins (see <PDOC00586>).

[ 1] Deutsch H.F. Int. J. Biochem. 19:101-113(1987).

[ 2] Fernley R.T. Trends Biochem. Sci. 13:356-359(1988).

[ 3] Tashian R.E. BioEssays 10:186-192(1989).

[ 4] Edwards Y. Biochem. Soc. Trans. 18:171-175(1990).

[ 5] Skaggs L.A., Bergenhem N.C.H., Venta P.J., Tashian R.E. Gene 126:291-292(1993).
[ 6] Fujiwara S., Fukuzawa H., Tachiki A., Miyachi S. Proc. Natl. Acad. Sci. U.S.A.
87:9779-9783(1990).

[ 7] Huang S., Xue Y., Sauer-Eriksson E., Chirica L., Lindskog S., Jonsson B.H. J. Mol. Biol.
283:301-310(1998).

112. Caseins alpha/beta signature 

Caseins [1] are the major protein constituent of milk. Caseins can be classified into two
families; the first consists of the kappa-caseins, and the second groups the alpha-s1, alpha-s2,
and beta-caseins. The alpha/beta caseins are a rapidly diverging family of proteins. However,
two regions are conserved: a cluster of phosphorylated serine residues and the signal
sequence. A signature pattern has been developed for this family of proteins based upon
the last eight residues of the signal sequence.
Consensus pattern: C-L-[LV]-A-x-A-[LVF]-A

[ 1] Holt C., Sawyer L. Protein Eng. 2:251-259(1988).



113. Catalase signatures

Catalase (EC 1.11.1.6) [1,2,3] is an enzyme, present in all aerobic cells, that decomposes
hydrogen peroxide to molecular oxygen and water. Its main function is to protect cells from
the toxic effects of hydrogen peroxide. In eukaryotic organisms and in some prokaryotes
catalase is a molecule composed of four identical subunits. Each of the subunits binds one
protoheme IX group. A conserved tyrosine serves as the heme proximal side ligand. The
region around this residue has been used as a first signature pattern; it also includes a
conserved arginine that participates in heme-binding. A conserved histidine has been shown
to be important for the catalytic mechanism of the enzyme. The region around this residue
has been selected as a second signature pattern.

Consensus pattern: R-[LIVMFSTAN]-F-[GASTNP]-Y-x-D-[AST]-[QEH] [Y is the proximal 
heme-binding ligand] 

Consensus pattern: [IF]-x-[RH]-x(4)-[EQ]-R-x(2)-H-x(2)-[GAS]-[GASTF]-[GAST] [H is an 
active site residue] 

Note: some prokaryotic catalases belong to the peroxidase family (see <PDOC00394>). 

[ 1] Murthy M.R.N., Reid T.J. III, Sicignano A., Tanaka N., Rossmann M.G. J. Mol. Biol.
152:465-499(1981).

[ 2] Melik-Adamyan W.R., Barynin V.V., Vagin A.A., Borisov V.V., Vainshtein B.K., Fita I.,
Murthy M.R.N., Rossmann M.G. J. Mol. Biol. 188:63-72(1986).

[ 3] von Ossowski I., Hausner G., Loewen P.C. J. Mol. Evol. 37:71-76(1993).

114. (chitin binding) Chitin recognition or binding domain signature 

A conserved domain of 43 amino acids is found in several plant and fungal proteins that have
a common binding specificity for oligosaccharides of N-acetylglucosamine [1]. This domain
may be involved in the recognition or binding of chitin subunits. It has been found in the
proteins listed below.

- A number of non-leguminous plant lectins. The best characterized of these lectins are the
three highly homologous wheat germ agglutinins (WGA-1, 2 and 3). WGA is an
N-acetylglucosamine/N-acetylneuraminic acid binding lectin which structurally consists of a
fourfold repetition of the 43 amino acid domain. The same type of structure is found in a
barley root-specific lectin as well as a rice lectin.

- Plant endochitinases (EC 3.2.1.14) from class IA (see <PDOC00620>). Endochitinases are
enzymes that catalyze the hydrolysis of the beta-1,4 linkages of N-acetylglucosamine
polymers of chitin. Plant chitinases function as a defense against chitin-containing fungal
pathogens. Class IA chitinases generally contain one copy of the chitin-binding domain at
their N-terminal extremity. An exception is agglutinin/chitinase [2] from the stinging nettle
Urtica dioica, which contains two copies of the domain.

- Hevein [5], a wound-induced protein found in the latex of rubber trees.

- Win1 and win2, two wound-induced proteins from potato.

- Kluyveromyces lactis killer toxin alpha subunit [3]. The toxin encoded by the linear plasmid
pGKL1 is composed of three subunits: alpha, beta, and gamma. The gamma subunit harbors
toxin activity and inhibits growth of sensitive yeast strains in the G1 phase of the cell cycle;
the alpha subunit, which is proteolytically processed from a larger precursor that also
contains the beta subunit, is a chitinase (see <PDOC00839>).

In chitinases, as well as in the potato wound-induced proteins, the 43-residue domain
directly follows the signal sequence and is therefore at the N-terminal of the mature protein;
in the killer toxin alpha subunit it is located in the central section of the protein. The
domain contains eight conserved cysteine residues which have all been shown, in WGA, to be
involved in disulfide bonds. The topological arrangement of the four disulfide bonds is shown
in the following figure:

[Figure: disulfide-bond topology of the chitin-binding domain]

xxCgxxxxxxxCxxxxCCsxxgxCgxxxxxCxxxCxxxxC

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

-Consensus pattern: C-x(4,5)-C-C-S-x(2)-G-x-C-G-x(4)-[FYW]-C [The five C's are involved
in disulfide bonds]

[ 1] Wright H.T., Sandrasegaram G., Wright C.S. J. Mol. Evol. 33:283-294(1991). 
[ 2] Lerner D.R., Raikhel N.V. J. Biol. Chem. 267:11085-11091(1992). 
[ 3] Butler A.R., O'Donnel R.W., Martin V.J., Gooday G.W., Stark M.J.R. Eur. J. Biochem.
199:483-488(1991).

115. (Chitinase 1) Chitinases family 19 signatures 

Chitinases (EC 3.2.1.14) [1] are enzymes that catalyze the hydrolysis of the
beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. From the viewpoint of sequence
similarity chitinases belong to either family 18 or 19 in the classification of glycosyl
hydrolases [2,E1]. Chitinases of family 19 (also known as classes IA or I and IB or II) are
enzymes from plants that function in the defense against fungal and insect pathogens by
destroying their chitin-containing cell wall. Class IA/I and IB/II enzymes differ in the
presence (IA/I) or absence (IB/II) of an N-terminal chitin-binding domain (see the relevant
entry <PDOC00025>). The catalytic domain of these enzymes consists of about 220 to 230
amino acid residues. Two highly conserved regions have been selected as signature patterns.
The first is located in the N-terminal section and contains one of the six cysteines which
are conserved in most, if not all, of these chitinases and which is probably involved in a
disulfide bond.

Consensus pattern: C-x(4,5)-F-Y-[ST]-x(3)-[FY]-[LIVMF]-x-A-x(3)-[YF]-x(2)-F-[GSA]

Consensus pattern: [LIVM]-[GSA]-F-x-[STAG](2)-[LIVMFY]-W-[FY]-W-[LIVM]

[ 1] Flach J., Pilet P.-E., Jolles P. Experientia 48:701-716(1992). 
[ 2] Henrissat B. Biochem. J. 280:309-316(1991). 

116. chloroa_b-bind 

Chlorophyll A-B binding proteins. Number of members: 211 

117. chromo 

The 'chromo' (CHRromatin Organization Modifier) domain [1 to 4] is a conserved 
region of about 60 amino acids which was originally found in Drosophila 
modifiers of variegation, which are proteins that modify the structure of 
chromatin to the condensed morphology of heterochromatin, a cytologically 
visible condition where gene expression is repressed. In protein Polycomb, the 
chromo domain has been shown to be important for chromatin targeting. Proteins
that contain a chromo domain seem to fall into three classes:

a) Proteins which have a N-terminal chromo domain followed by a region which 
is related to but distinct from the chromo domain and which has been 
termed [3] the 'chromo shadow' domain. 

b) Proteins with a single chromo domain. 

c) Proteins with paired tandem chromo domains.

Currently, this domain has been found in the following proteins:

Class A.

- Drosophila heterochromatin protein Su(var)205 (HP1).

- Human heterochromatin protein HP1 alpha.

- Mammalian modifier 1 and modifier 2.

- Fission yeast swi6, a protein involved in the repression of the silent
mating-type loci mat2 and mat3.

Class B. 

- Drosophila protein Polycomb (Pc). 

- Mammalian modifier 3, a homolog of Pc. 

- Drosophila protein Su(var)3-9, a suppressor of position-effect variegation. 

- Human Mi-2 autoantigen, characteristic of dermatomyositis.

- Fungal retrotransposon polyproteins: 'skippy' from Fusarium oxysporum,
'grasshopper' and 'MAGGY' from Magnaporthe grisea and CfT-1 from
Cladosporium fulvum.

- Fission yeast hypothetical protein SpAC18G6.02c.

- Caenorhabditis elegans hypothetical protein C29H12.5.

- Caenorhabditis elegans hypothetical protein ZK1236.2. 

- Caenorhabditis elegans hypothetical protein T09A5.8. 

Class C. 

- Mammalian DNA-binding/helicase proteins CHD-1 to CHD-4. 

- Yeast protein CHD1.

The signature pattern for this domain corresponds to its best conserved section, which is
located in its central part.



-Consensus pattern: [FYL]-x-[LIVMC]-[KR]-W-x-[GDNR]-[FYWLME]-x(5,6)-[ST]-W- 
[ESV]-[PSTDEN]-x(2,3)-[LIVMC]

[ 1] Paro R. Trends Genet. 6:416-421(1990).

[ 2] Singh P.B., Miller J.R., Pearce J., Kothary R., Burton R.D., Paro R., James T.C., Gaunt 
S.J. Nucleic Acids Res. 19:789-794(1991). 

[ 3] Aasland R., Stewart A.F. Nucleic Acids Res. 23:3168-3173(1995). 

[ 4] Koonin E.V., Zhou S., Lucchesi J.C. Nucleic Acids Res. 23:4229-4233(1995).

118. citrate_synt 

Citrate synthase (EC 4.1.3.7) (CS) is the tricarboxylic acid cycle enzyme that 
catalyzes the synthesis of citrate from oxaloacetate and acetyl-CoA in an 
aldol condensation. CS can directly form a carbon-carbon bond in the absence 
of metal ion cofactors. 

In prokaryotes, citrate synthase is composed of six identical subunits. In 
eukaryotes, there are two isozymes of citrate synthase: one is found in the 
mitochondrial matrix, the second is cytoplasmic. Both seem to be dimers of 
identical chains. 

There are a number of regions of sequence similarity between prokaryotic and 
eukaryotic citrate synthases. One of the best conserved contains a histidine 
which is one of three residues shown [1] to be involved in the catalytic 
mechanism of the vertebrate mitochondrial enzyme. This region has been used as a 
signature pattern. 

-Consensus pattern: G-[FYA]-[GA]-H-x-[IV]-x(1,2)-[RKT]-x(2)-D-[PS]-R [H is an active
site residue]

[ 1] Karpusas M., Branchaud B., Remington S.J. Biochemistry 29:2213-2219(1990). 

119. clpA_B 
Chaperonin clpA/B 

CAUTION! This family is a subfamily of the AAA
superfamily. The threshold has been set very high to
stop overlaps with the AAA superfamily. This
entry will be subsumed by AAA in the future.
Number of members: 39 

A number of ATP-binding proteins that are thought to protect cells from
extreme stress by controlling the aggregation or denaturation of vital
cellular structures have been shown [1,2] to be evolutionarily related. These
proteins are listed below.

- Escherichia coli clpA, which acts as the regulatory subunit of the
ATP-dependent protease clp.

- Rhodopseudomonas blastica clpA homolog. 

- Escherichia coli heat shock protein clpB and homologs in other bacteria. 

- Bacillus subtilis protein mecB. 

- Yeast heat shock protein 104 (gene HSP104), which is vital for tolerance to 
heat, ethanol and other stresses. 

- Neurospora heat shock protein hsp98. 

- Yeast mitochondrial heat shock protein 78 (gene HSP78) [3]. 

- CD4A and CD4b, two highly related tomato proteins that seem to be located 
in the chloroplast. 

- Trypanosoma brucei protein clp. 

- Porphyra purpurea chloroplast encoded clpC. 

The size of these proteins ranges from 84 Kd (clpA) to slightly more than 100
Kd (HSP104). They all share two conserved regions of about 200 amino acids
that each contain an ATP-binding site. In addition to the ATP-binding A and
B motifs there are many parts in these two domains that are also conserved. Two
of these regions have been selected as signature patterns. The first signature
is located in the first domain, some ten residues to the C-terminal of the
ATP-binding B motif. The second pattern is located in the second domain in
between the ATP-binding A and B motifs.

-Consensus pattern: D-[AI]-[SGA]-N-[LIVMF](2)-K-[PT]-x-L-x(2)-G

-Consensus pattern: R-[LIVMFY]-D-x-S-E-[LIVMFY]-x-E-[KRQ]-x-[STA]-x-[STA]-[KR]- 
[LIVM]-x-G-[STA] 
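
Consensus patterns such as the two above are written in the PROSITE notation: square brackets list the residues allowed at a position, x stands for any residue, and a number in parentheses is a repeat count. As a minimal sketch only (the function name prosite_to_regex and the test peptide are illustrative assumptions and are not part of this specification), such a pattern could be translated into a regular expression and used to scan a sequence:

```python
import re

# One pattern element: a residue, "x", an allowed set [..] or a forbidden
# set {..}, optionally followed by a repeat count "(n)" or range "(n,m)".
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().strip("-").split("-"):
        match = _ELEMENT.fullmatch(element.strip())
        if match is None:
            raise ValueError(f"unrecognised pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except those listed
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


# First clpA/B signature, copied from the text above.
CLP_SIGNATURE_1 = "D-[AI]-[SGA]-N-[LIVMF](2)-K-[PT]-x-L-x(2)-G"
clp_regex = re.compile(prosite_to_regex(CLP_SIGNATURE_1))

# Invented peptide used only to exercise the pattern (not a real clp sequence).
toy_peptide = "MKTDASNLLKPALAAGEE"
hit = clp_regex.search(toy_peptide)
print(clp_regex.pattern, hit.start() if hit else None)
```

The translation is purely syntactic; it makes no statement about the biological significance of a hit, which would still need to be assessed against the full family alignment.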

[ 1] Gottesman S., Squires C., Pichersky E., Carrington M., Hobbs M., Mattick J.S.,
Dalrymple B., Kuramitsu H., Shiroza T., Foster T., Clark W.P., Ross B., Squires C.L.,
Maurizi M.R. Proc. Natl. Acad. Sci. U.S.A. 87:3513-3517(1990).

[ 2] Parsell D.A., Sanchez Y., Stitzel J.D., Lindquist S. Nature 353:270-273(1991).

[ 3] Leonhardt S.A., Fearon K., Danese P.N., Mason T.L. Mol. Cell. Biol. 13:6304-6313(1993).



120. cofilin_ADF 

Cofilin/tropomyosin-type actin-binding proteins 
[1] 

Medline: 97290449 
Structure determination of yeast cofilin. 
Fedorov AA, Lappalainen P, Fedorov EV, Drubin DG, Almo SC; 
Nat Struct Biol 1997;4:366-369. 
[2] 

Medline: 97290450 

Crystal structure of the actin-binding protein actophorin 
from Acanthamoeba. 
Leonard SA, Gittis AG, Petrella EC, Pollard TD, Lattman EE; 

Nat Struct Biol 1997;4:369-373. 

[3] 

Medline: 97420794 

F-actin and G-actin binding are uncoupled by mutation of 
conserved tyrosine residues in maize actin depolymerizing 
factor. 

Jiang CJ, Weeds AG, Khan S, Hussey PJ; 
Proc Natl Acad Sci U S A 1997;94:9973-9978. 
[4]
Medline: 97357155

Cofilin promotes rapid actin filament turnover in vivo. 
Lappalainen P, Drubin DG; 

Nature 1997;388:78-82. 
Severs actin filaments and binds to actin monomers. 
Number of members: 44 

Actin-depolymerizing proteins sever actin filaments (F-actin) and/or bind to
actin monomers (G-actin), thus preventing actin polymerization by
sequestering the monomers. The following proteins are evolutionarily related
and belong to a family of low molecular weight (137 to 166 residues) actin-
depolymerizing proteins [1,2,3,4]:

- Cofilin from vertebrates, slime mold and yeast. Cofilin binds to F-actin 
and acts as a pH-dependent actin-depolymerizing protein. 

- Destrin from vertebrates. Destrin binds to G-actin in a pH-independent 
manner and prevents polymerization. 

- Caenorhabditis elegans unc-60. 

- Acanthamoeba castellanii actophorin. 

- Plant actin-depolymerizing factor (ADF).

The most conserved region of these proteins is a twenty amino-acid segment 
that ends some 30 residues from their C-terminal extremity. This segment has 
been shown [5] to be important for actin-binding. 

-Consensus pattern: P-[DE]-x-[SA]-x-[LIVMT]-[KR]-x-[KR]-M-[LIVM]-[YA]-[STA](3)- 
x(3)-[LIVMF]-[KR] 

[ 1] Hawkins M., Pope B., MacIver S.K., Weeds A.G. Biochemistry 32:9985-9993(1993).

[ 2] Iida K., Moriyama K., Matsumoto S., Kawasaki H., Nishida E., Yahara I. Gene 124:115-
120(1993).

[ 3] Quirk S., MacIver S.K., Ampe C., Doberstein S.K., Kaiser D.A., van Damme J.,
Vandekerckhove J., Pollard T.D. Biochemistry 32:8525-8533(1993).

[ 4] McKim K.S., Matheson C., Marra M.A., Wakarchuk M.F., Baillie D.L. Mol. Gen. Genet.
242:346-357(1994).

[ 5] Moriyama K., Yonezawa N., Sakai H., Yahara I., Nishida E. J. Biol. Chem. 267:7240-
7244(1992).

121. (Complex 24kd) Respiratory-chain NADH dehydrogenase 24 Kd subunit signature 
Respiratory-chain NADH dehydrogenase (EC 1.6.5.3) [1,2] (also known as complex I or
NADH-ubiquinone oxidoreductase) is an oligomeric enzymatic complex located in the inner
mitochondrial membrane which also seems to exist in the chloroplast and in cyanobacteria (as
a NADH-plastoquinone oxidoreductase). Among the 25 to 30 polypeptide subunits of this
bioenergetic enzyme complex there is one with a molecular weight of 24 Kd (in mammals),
which is a component of the iron-sulfur (IP) fragment of the enzyme. It seems to bind a
2Fe-2S iron-sulfur cluster. The 24 Kd subunit is nuclear encoded, as a precursor form with a
transit peptide, in mammals and in Neurospora crassa. The 24 Kd subunit is highly similar to
[3,4]:

- Subunit E of Escherichia coli NADH-ubiquinone oxidoreductase (gene nuoE).

- Subunit NQO2 of Paracoccus denitrificans NADH-ubiquinone oxidoreductase.

A highly conserved region, located in the central section of this subunit and containing two
conserved cysteines that are probably involved in the binding of the 2Fe-2S center, has been
selected as a signature pattern.

-Consensus pattern: D-x(2)-F-[ST]-x(5)-C-L-G-x-C-x(2)-[GA]-P [The two C's are putative
2Fe-2S ligands]

[ 1] Ragan C.I. Curr. Top. Bioenerg. 15:1-36(1987). 

[ 2] Weiss H., Friedrich T., Hofhaus G., Preis D. Eur. J. Biochem. 197:563-576(1991). 

[ 3] Fearnley I.M., Walker J.E. Biochim. Biophys. Acta 1140:105-134(1992). 

[ 4] Weidner U., Geier S., Ptock A., Friedrich T., Leif H., Weiss H. J. Mol. Biol. 233:109-
122(1993).

122. copper-bind 

Copper binding proteins, plastocyanin/azurin family 
Number of members: 70

Blue or 'type-1' copper proteins are small proteins which bind a single
copper atom and which are characterized by an intense electronic absorption 
band near 600 nm [1,2]. The most well known members of this class of proteins 
are the plant chloroplastic plastocyanins, which exchange electrons with 
cytochrome c6, and the distantly related bacterial azurins, which exchange 
electrons with cytochrome c551. This family of proteins also includes all the 
proteins listed below (references are only provided for recently determined 
sequences). 

- Amicyanin from bacteria such as Methylobacterium extorquens or Thiobacillus
versutus that can grow on methylamine. Amicyanin appears to be an electron
acceptor for methylamine dehydrogenase.

- Auracyanins A and B from Chloroflexus aurantiacus [3]. These proteins can 
donate electrons to cytochrome c-554. 

- Blue copper protein from Alcaligenes faecalis. 

- Cupredoxin (CPC) from cucumber peelings [4]. 

- Cusacyanin (basic blue protein; plantacyanin, CBP) from cucumber. 

- Halocyanin from Natronobacterium pharaonis [5], a membrane-associated copper-
binding protein.

- Pseudoazurin from Pseudomonas. 

- Rusticyanin from Thiobacillus ferrooxidans. Rusticyanin is an electron 
carrier from cytochrome c-552 to the a-type oxidase [6]. 

- Stellacyanin from the Japanese lacquer tree. 

- Umecyanin from horseradish roots. 

- Allergen Ra3 from ragweed. This pollen protein is evolutionarily related to
the above proteins, but seems to have lost the ability to bind copper.

Although there is an appreciable amount of divergence in the sequences of all
these proteins, the copper ligand sites are conserved and a pattern which includes two
of the ligands (a cysteine and a histidine) has been developed.

-Consensus pattern: [GA]-x(0,2)-[YSA]-x(0,1)-[VFY]-x-C-x(1,2)-[PG]-x(0,1)-H-x(2,4)-
[MQ] [C and H are copper ligands]
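
The pattern above contains variable-length gaps such as x(0,2) and x(2,4), so the distance between the two copper ligands is not fixed. The following sketch, which assumes a hand translation of the pattern into a Python regular expression and uses a hypothetical helper name and an invented peptide, shows one way the positions of the Cys and His ligands could be reported for a matching sequence:

```python
import re
from typing import Optional, Tuple

# Hand translation of the consensus pattern above: x(0,2) becomes ".{0,2}".
# The two capture groups mark the Cys and His copper ligands.
BLUE_COPPER = re.compile(
    r"[GA].{0,2}[YSA].{0,1}[VFY].(C).{1,2}[PG].{0,1}(H).{2,4}[MQ]"
)


def copper_ligand_positions(sequence: str) -> Optional[Tuple[int, int]]:
    """Return 1-based positions of the Cys and His ligands, or None if no match."""
    match = BLUE_COPPER.search(sequence)
    if match is None:
        return None
    return match.start(1) + 1, match.start(2) + 1


# Invented peptide for illustration only (not a real plastocyanin or azurin).
toy_peptide = "MKGYVACSPHAAMK"
print(copper_ligand_positions(toy_peptide))
```

Capture groups are used here so that the ligand positions are read from the match itself rather than computed from fixed offsets, which the variable gaps would make unreliable.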

[ 1] Garret T.P.J., Clingeleffer D.J., Guss J.M., Rogers S.J., Freeman H.C. J. Biol. Chem.
259:2822-2825(1984).

[ 2] Ryden L.G., Hunt L.T. J. Mol. Evol. 36:41-66(1993). 

[ 3] McManus J.D., Brune D.C., Han J., Sanders-Loehr J., Meyer T.E., Cusanovich M.A., 
Tollin G., Blankenship R.E. J. Biol. Chem. 267:6531-6540(1992). 

[ 4] Mann K., Schaefer W., Thoenes U., Messerschmidt A., Mehrabian Z., Nalbandyan R.
FEBS Lett. 314:220-223(1992).

[ 5] Mattar S., Scharf B., Kent S.B.H., Rodewald K., Oesterhelt D., Engelhard M. J. Biol. 
Chem. 269:14939-14945(1994). 

[ 6] Yano T., Fukumori Y., Yamanaka T. FEBS Lett. 288:159-162(1991). 
123. Chaperonins cpn10 signature

Chaperonins [1,2] are proteins involved in the folding of proteins or the assembly of
oligomeric protein complexes. They seem to assist other polypeptides in maintaining or
assuming conformations which permit their correct assembly into oligomeric structures. They
are found in abundance in prokaryotes, chloroplasts and mitochondria. Chaperonins form
oligomeric complexes and are composed of two different types of subunits: a 60 Kd protein,
known as cpn60 (groEL in bacteria) and a 10 Kd protein, known as cpn10 (groES in
bacteria). The cpn10 protein binds to cpn60 in the presence of MgATP and suppresses the
ATPase activity of the latter. Cpn10 is a protein of about 100 amino acid residues whose
sequence is well conserved in bacteria, vertebrate mitochondria and plant chloroplasts [3,4].
Cpn10 assembles as a heptamer that forms a dome [5]. As a signature pattern for cpn10, a
region located in the N-terminal section of the protein was selected.

Consensus pattern: [LIVMFY]-x-P-[ILT]-x-[DEN]-[KR]-[LIVMFA](3)-[KREQ]-x(8,9)-
[SG]-x-[LIVMFY](3)

Note: this pattern is found twice in the plant chloroplast protein, which consists of a tandem
repeat of a cpn10 domain.

[ 1] Ellis R.J., van der Vies S.M. Annu. Rev. Biochem. 60:321-347(1991).

[ 2] Zeilstra-Ryalls J., Fayet O., Georgopoulos C. Annu. Rev. Microbiol. 45:301-325(1991).

[ 3] Hartman D.J., Hoogenraad N.J., Condron R., Hoj P.B. Proc. Natl. Acad. Sci. U.S.A.
89:3394-3398(1992).

[ 4] Bertsch U., Soll J., Seetharam R., Viitanen P.V. Proc. Natl. Acad. Sci. U.S.A. 89:8696-
8700(1992).

[ 5] Hunt J.F., Weaver A.J., Landry S.J., Gierasch L., Deisenhofer J. Nature 379:37-45(1996).
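
The note in this entry states that the signature occurs twice in a protein built from a tandem repeat of the domain. As a hedged illustration, the sketch below counts non-overlapping occurrences of the signature; the regular expression is a hand translation of the consensus pattern of this entry, and the function name and both test sequences are invented for the example rather than taken from any real chaperonin.

```python
import re

# Hand translation of the cpn10 consensus pattern from this entry.
CPN10 = re.compile(
    r"[LIVMFY].P[ILT].[DEN][KR][LIVMFA]{3}[KREQ].{8,9}[SG].[LIVMFY]{3}"
)


def count_cpn10_signatures(sequence: str) -> int:
    """Count non-overlapping occurrences of the cpn10 signature in a sequence."""
    return sum(1 for _ in CPN10.finditer(sequence))


# Invented 24-residue segment that satisfies the pattern (illustration only).
toy_domain = "LKPLGDRVLVKAEEETKTAGGIVL"
# Two copies joined by an arbitrary linker mimic a tandem-repeat protein.
toy_tandem = "MA" + toy_domain + "PEKAQ" + toy_domain + "KE"

print(count_cpn10_signatures(toy_domain), count_cpn10_signatures(toy_tandem))
```

A count of two or more would be consistent with a tandem arrangement, although confirming a true domain duplication would require more than a signature count.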
124. Chaperonins cpn60 signature (cpn60_TCP1)

Chaperonins [1,2] are proteins involved in the folding of proteins or the assembly of
oligomeric protein complexes. Their role seems to be to assist other polypeptides to maintain
or assume conformations which permit their correct assembly into oligomeric structures.
They are found in abundance in prokaryotes, chloroplasts and mitochondria. Chaperonins
form oligomeric complexes and are composed of two different types of subunits: a 60 Kd
protein, known as cpn60 (groEL in bacteria) and a 10 Kd protein, known as cpn10 (groES in
bacteria). The cpn60 protein shows weak ATPase activity and is a highly conserved protein of
about 550 to 580 amino acid residues which has been described by different names in
different species:

- Escherichia coli groEL protein, which is essential for the growth of the
bacteria and the assembly of several bacteriophages.

- Cyanobacterial groEL analogues.

- Mycobacterium tuberculosis and leprae 65 Kd antigen, Coxiella burnetii heat shock protein B
(gene htpB), Rickettsia tsutsugamushi major antigen 58, and Chlamydial 57 Kd
hypersensitivity antigen (gene hypB).

- Chloroplast RuBisCO subunit binding-protein alpha and beta chains, which bind ribulose
bisphosphate carboxylase small and large subunits and are implicated in the assembly of the
enzyme oligomer.

- Mammalian mitochondrial matrix protein P1 (mitonin or P60).

- Yeast HSP60 protein, a mitochondrial assembly factor.

As a signature pattern for these proteins, a rather well-conserved region of twelve residues,
located in the last third of the cpn60 sequence, was chosen.

Consensus pattern: A-[AS]-x-[DEQ]-E-x(4)-G-G-[GA]

[ 1] Ellis R.J., van der Vies S.M. Annu. Rev. Biochem. 60:321-347(1991). 

[ 2] Zeilstra-Ryalls J., Fayet O., Georgopoulos C. Annu. Rev. Microbiol. 45:301-325(1991).

Chaperonins TCP-1 signatures (cpn60_TCP1)

The TCP-1 protein [1,2] (Tailless Complex Polypeptide 1) was first identified in mice where
it is especially abundant in testis but present in all cell types. It has since been found and
characterized in many other mammalian species, in Drosophila and in yeast. TCP-1 is a
highly conserved protein of about 60 Kd (556 to 560 residues) which participates in a hetero-
oligomeric 900 Kd double-torus shaped particle [3] with 6 to 8 other different subunits. These
subunits, the chaperonin containing TCP-1 (CCT) subunits beta, gamma, delta, epsilon, zeta
and eta, are evolutionarily related to TCP-1 itself [4,5]. The CCT is known to act as a molecular
chaperone for tubulin, actin and probably some other proteins. The CCT subunits are highly
related to archaebacterial counterparts:

- TF55 and TF56 [6], a molecular chaperone from Sulfolobus shibatae. TF55 has ATPase
activity, is known to bind unfolded polypeptides and forms an oligomeric complex of two
stacked nine-membered rings.

- Thermosome [7], from Thermoplasma acidophilum. The thermosome is composed of two
subunits (alpha and beta) and also seems to be a chaperone with ATPase activity. It forms an
oligomeric complex of eight-membered rings.

The TCP-1 family of proteins is weakly, but significantly [8], related
to the cpn60/groEL chaperonin family (see <PDOC00268>). As signature patterns of this
family of chaperonins, three conserved regions located in the N-terminal domain were
chosen.

Consensus pattern: [RKEL]-[ST]-x-[LMFY]-G-P-x-[GSA]-x-x-K-[LIVMF](2)

Consensus pattern: [LIVM]-[TS]-[NK]-D-[GA]-[AVNHK]-[TAV]-[LIVM](2)-x(2)-
[LIVM]-x-[LIVM]-x-[SNH]-[PQH]

Consensus pattern: Q-[DEK]-x-x-[LIVMGTA]-[GA]-D-G-T

[ 1] Ellis J. Nature 358:191-192(1992). 

[ 2] Nelson R.J., Craig E.A. Curr. Biol. 2:487-489(1992). 

[ 3] Lewis V.A., Hynes G.M., Zheng D., Saibil H., Willison K.R. Nature 358:249-252(1992).

[ 4] Kubota H., Hynes G., Carne A., Ashworth A., Willison K.R. Curr. Biol. 4:89-99(1994).

[ 5] Kim S., Willison K.R., Horwich A.L. Trends Biochem. Sci. 20:543-548(1994).

[ 6] Trent J.D., Nimmesgern E., Wall J.S., Hartl F.U., Horwich A.L. Nature 354:490-
493(1991).

[ 7] Waldmann T., Lupas A., Kellermann J., Peters J., Baumeister W. Biol. Chem. Hoppe
Seyler 376:119-126(1995).

[ 8] Hemmingsen S.M. Nature 357:650-650(1992). 

125. cyclin (Cyclins) 
The cyclins include an internal duplication, which is related 
to that found in TFIIB and the RB protein. 
[1] 

Medline: 94203808 

Evidence for a protein domain superfamily shared by the cyclins,
TFIIB and RB/p107.
Gibson TJ, Thompson JD, Blocker A, Kouzarides T; 
Nucleic Acids Res 1994;22:946-952. 
[2] 

Medline: 96164440 
The crystal structure of cyclin A 
Brown NR, Noble MEM, Endicott JA, Garman EF, Wakatsuki S, 
Mitchell E, Rasmussen B, Hunt T, Johnson LN; 

Structure. 1995;3:1235-1247. 
Complex of cyclin and cyclin-dependent kinase.
[3] 

Medline: 96313126 

Structural basis of cyclin-dependent kinase activation by
phosphorylation.

Russo AA, Jeffrey PD, Pavletich NP; 
Nat Struct Biol. 1996;3:696-700. 

Cyclins regulate cyclin-dependent kinases (CDKs).

The most divergent prosite members have been included. Swiss:P22674,
the Uracil-DNA glycosylase 2, is the highest noise and may be related
but has not been included.
Number of members: 189

Cyclins [1,2,3] are eukaryotic proteins which play an active role in
controlling nuclear cell division cycles. Cyclins, together with the p34
(cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are
two main groups of cyclins:

- G2/M cyclins, essential for the control of the cell cycle at the G2/M
(mitosis) transition. G2/M cyclins accumulate steadily during G2 and are
abruptly destroyed as cells exit from mitosis (at the end of the M-phase).

- G1/S cyclins, essential for the control of the cell cycle at the G1/S
(start) transition.

In most species, there are multiple forms of G1 and G2 cyclins. For example,
in vertebrates, there are two G2 cyclins, A and B, and at least three G1
cyclins, C, D, and E.

A cyclin homolog has also been found in herpesvirus saimiri [4].

The best conserved region is in the central part of the cyclins' sequences,
known as the 'cyclin-box'. From this, a 32 residue pattern has been derived.

-Consensus pattern: R-x(2)-[LIVMSA]-x(2)-[FYWS]-[LIVM]-x(8)-[LIVMFC]-x(4)-
[LIVMFYA]-x(2)-[STAGC]-[LIVMFYQ]-x-[LIVMFYC]-[LIVMFY]-D-[RKH]-
[LIVMFYW]

[ 1] Nurse P. Nature 344:503-508(1990).

[ 2] Norbury C., Nurse P. Curr. Biol. 1:23-24(1991).

[ 3] Lew D.J., Reed S.I. Trends Cell Biol. 2:77-81(1992).

[ 4] Nicholas J., Cameron K.R., Honess R.W. Nature 355:362-365(1992).


126. Cystatin domain 

This is a very diverse family. Attempts to define separate subfamilies have failed. Typically,
either the N-terminal or C-terminal end is very divergent, but splitting into two domains
would make very short families. Cathelicidins are related to this family but have not been
included. Number of members: 147

Inhibitors of cysteine proteases [1,2,3], which are found in the tissues and body fluids 
of animals, in the larva of the worm Onchocerca volvulus [4], as well as in plants, can be 
grouped into three distinct but related families: 

- Type 1 cystatins (or stefins), molecules of about 100 amino acid residues with 
neither disulfide bonds nor carbohydrate groups. 

- Type 2 cystatins, molecules of about 115 amino acid residues which contain one 
or two disulfide loops near their C-terminus. 

- Kininogens, which are multifunctional plasma glycoproteins. They are the precursor
of the active peptide bradykinin and play a role in blood coagulation by helping to
position optimally prekallikrein and factor XI next to factor XII. They are also inhibitors
of cysteine proteases. Structurally, kininogens are made of three contiguous type-2
cystatin domains, followed by an additional domain (of variable length) which contains
the sequence of bradykinin. The first of the three cystatin domains seems to have lost its
inhibitory activity.

In all these inhibitors, there is a conserved region of five residues which has been 
proposed to be important for the binding to the cysteine proteases. The consensus pattern 
starts one residue before this conserved region. 

-Consensus pattern: [GSTEQKRV]-Q-[LIVT]-[VAF]-[SAGQ]-G-x-[LIVMNK]-x(2)- 
[LIVMFY]-x-[LIVMFYA]-[DENQKRHSIV] 

[1] Barrett A.J. Trends Biochem. Sci. 12:193-196(1987). 

[2] Rawlings N.D., Barrett A.J. J. Mol. Evol. 30:60-71(1990). 

[3] Turk V., Bode W. FEBS Lett. 285:213-219(1991).

[4] Lustigman S., Brotman B., Huima T., Prince A.M. Mol. Biochem. Parasitol. 45:65- 
76(1991). 

127. cytochrome_c (Cytochrome c) 
The Pfam entry does not include all prosite members.
The cytochrome 556 and cytochrome c' families are
not included.
Number of members: 259 

In proteins belonging to the cytochrome c family [1], the heme group is covalently
attached by thioether bonds to two conserved cysteine residues. The consensus
sequence for this site is Cys-X-X-Cys-His and the histidine residue is one of
the two axial ligands of the heme iron. This arrangement is shared by all
proteins known to belong to the cytochrome c family, which presently includes
cytochromes c, c', c1 to c6, c550 to c556, cc3/Hmc, cytochrome f and reaction
center cytochrome c.

-Consensus pattern: C-{CPWHF}-{CPWR}-C-H-{CFYW} 
[ 1] Mathews F.S. Prog. Biophys. Mol. Biol. 45:1-56(1985). 
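
The cytochrome c pattern uses curly braces, which in this notation list residues that are not allowed at a position, so it is more restrictive than the plain Cys-X-X-Cys-His motif. A minimal sketch under that reading is given below; the regular expression is a hand translation, and the function name and the two test peptides are assumptions made for illustration.

```python
import re
from typing import List

# In this notation {CPWHF} means "any residue except C, P, W, H or F",
# which corresponds to the negated character class [^CPWHF].
HEME_SITE = re.compile(r"C[^CPWHF][^CPWR]CH[^CFYW]")


def heme_binding_sites(sequence: str) -> List[int]:
    """Return 1-based start positions of every non-overlapping pattern match."""
    return [m.start() + 1 for m in HEME_SITE.finditer(sequence)]


# Invented test peptides (illustration only).
print(heme_binding_sites("GDVEKGKKIFVQKCAQCHTVEK"))  # contains C-A-Q-C-H-T
print(heme_binding_sites("AACPACHAA"))               # Pro directly follows the first Cys
```

The second peptide contains Cys, Cys and His residues but is rejected because proline is among the residues excluded after the first cysteine.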

128. (DAGKa) Diacylglycerol kinase accessory domain (presumed) 

Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. 
This domain is assumed to be an accessory domain: its function is unknown. 

[1] Sakane F, Yamada K, Kanoh H, Yokoyama C, Tanabe T, Nature 1990;344:345-348.
[2] Sakane F, Imai S, Kai M, Wada I, Kanoh H, J Biol Chem 1996;271:8394-8401.
[3] Schaap D, de Widt J, van der Wal J, Vandekerckhove J, van Damme J, Gussow D, Ploegh
HL, van Blitterswijk WJ, van der Bend RL, FEBS Lett 1990;275:151-158.
[4] Kanoh H, Yamada K, Sakane F, Trends Biochem Sci 1990;15:47-50.

129. (DAGKc) Diacylglycerol kinase catalytic domain (presumed) 

Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. 
The catalytic domain is assumed from the finding of bacterial homologues. 

[1] Sakane F, Yamada K, Kanoh H, Yokoyama C, Tanabe T, Nature 1990;344:345-348.
[2] Sakane F, Imai S, Kai M, Wada I, Kanoh H, J Biol Chem 1996;271:8394-8401.
[3] Schaap D, de Widt J, van der Wal J, Vandekerckhove J, van Damme J, Gussow D, Ploegh
HL, van Blitterswijk WJ, van der Bend RL, FEBS Lett 1990;275:151-158.
[4] Kanoh H, Yamada K, Sakane F, Trends Biochem Sci 1990;15:47-50.

130. D-amino acid oxidases signature (DAO)

D-amino acid oxidase (EC 1.4.3.3 ) (DAMOX or DAO) is an FAD flavoenzyme that catalyzes 
the oxidation of neutral and basic D-amino acids into their corresponding keto acids. DAOs 
have been characterized and sequenced in fungi and vertebrates where they are known to be 
located in the peroxisomes. D-aspartate oxidase (EC 1.4.3.1) (DASOX) [1] is an enzyme, 
structurally related to DAO, which catalyzes the same reaction but is active only toward 
dicarboxylic D-amino acids. In DAO, a conserved histidine has been shown [2] to be 
important for the enzyme's catalytic activity. The conserved region around this residue has 
been developed as a signature pattern for these enzymes. 

Consensus pattern: [LIVM](2)-H-[NHA]-Y-G-x-[GSA](2)-x-G-x(5)-G-x-A [H is a probable
active site residue]

[ 1] Negri A., Ceciliani F., Tedeschi G., Simonic T., Ronchi S. J. Biol. Chem. 267:11865- 
11871(1992). 

[ 2] Miyano M., Fukui K., Watanabe F., Takahashi S., Tada M., Kanashiro M., Miyake Y. J. 
Biochem. 109:171-177(1991). 

131. DEAD and DEAH box families ATP-dependent helicases signatures 
A number of eukaryotic and prokaryotic proteins have been characterized [1,2,3] on the basis
of their structural similarity. They all seem to be involved in ATP-dependent, nucleic-acid
unwinding. Proteins currently known to belong to this family are:

- Initiation factor eIF-4A. Found in eukaryotes, this protein is a subunit of a high molecular
weight complex involved in 5' cap recognition and the binding of mRNA to ribosomes. It is
an ATP-dependent RNA-helicase.

- PRP5 and PRP28. These yeast proteins are involved in various ATP-requiring
steps of the pre-mRNA splicing process.

- PL10, a mouse protein expressed specifically during spermatogenesis.

- An3, a Xenopus putative RNA helicase, closely related to PL10.

- SPP81/DED1 and DBP1, two yeast proteins probably involved in pre-mRNA splicing and
related to PL10.

- Caenorhabditis elegans helicase glh-1.

- MSS116, a yeast protein required for mitochondrial splicing.

- SPB4, a yeast protein involved in the maturation of 25S ribosomal RNA.

- p68, a human nuclear antigen. p68 has ATPase and DNA-helicase
activities in vitro. It is involved in cell growth and division.

- Rm62 (p62), a Drosophila putative RNA helicase related to p68.

- DBP2, a yeast protein related to p68.

- DHH1, a yeast protein.

- DRS1, a yeast protein involved in ribosome assembly.

- MAK5, a yeast protein involved in maintenance of dsRNA killer plasmid.

- ROK1, a yeast protein.

- ste13, a fission yeast protein.

- Vasa, a Drosophila protein important for oocyte formation and specification
of embryonic posterior structures.

- Me31B, a Drosophila maternally expressed protein of unknown function.

- dbpA, an Escherichia coli putative RNA helicase.

- deaD, an Escherichia coli putative RNA helicase which can suppress a mutation in the
rpsB gene for ribosomal protein S2.

- rhlB, an Escherichia coli putative RNA helicase.

- rhlE, an Escherichia coli putative RNA helicase.

- srmB, an Escherichia coli protein that shows RNA-dependent
ATPase activity. It probably interacts with 23S ribosomal RNA.

- Caenorhabditis elegans hypothetical proteins T26G10.1, ZK512.2 and ZK686.2.

- Yeast hypothetical protein YHR065C.

- Yeast hypothetical protein YHR169w.

- Fission yeast hypothetical protein SpAC31A2.07c.

- Bacillus subtilis hypothetical protein yxiN.

All these proteins share a number of conserved sequence motifs. Some of them are specific
to this family while others are shared by other ATP-binding proteins or by proteins belonging
to the helicases 'superfamily' [4,E1]. One of these motifs, called the 'D-E-A-D-box',
represents a special version of the B motif of ATP-binding proteins. Some other proteins
belong to a subfamily which have His instead of the second Asp and are thus said to be
'D-E-A-H-box' proteins [3,5,6,11]. Proteins currently known to belong to this subfamily are:

- PRP2, PRP16, PRP22 and PRP43. These yeast proteins are all involved in various
ATP-requiring steps of the pre-mRNA splicing process.

- Fission yeast prh1, which may be involved in pre-mRNA splicing.

- Male-less (mle), a Drosophila protein required in males for dosage compensation of X
chromosome linked genes.

- RAD3 from yeast. RAD3 is a DNA helicase involved in excision
repair of DNA damaged by UV light, bulky adducts or cross-linking agents. Fission yeast
rad15 (rhp3) and mammalian DNA excision repair protein XPD (ERCC-2) are the homologs
of RAD3.

- Yeast CHL1 (or CTF1), which is important for chromosome transmission and
normal cell cycle progression in G(2)/M.

- Yeast TPS1.

- Yeast hypothetical protein YKL078W.

- Caenorhabditis elegans hypothetical proteins C06E1.10 and K03H1.2.

- Poxviruses' early transcription factor 70 Kd subunit which acts with RNA polymerase to
initiate transcription from early gene promoters.

- I8, a putative vaccinia virus helicase.

- hrpA, an Escherichia coli putative RNA helicase.

Signature patterns for both subfamilies were developed.

Consensus pattern: [LIVMF](2)-D-E-A-D-[RKEN]-x-[LIVMFYGSTN]

Consensus pattern: [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-[NECR]

Note: proteins belonging to this family also contain a copy of the ATP/GTP-binding motif
'A' (P-loop) (see the relevant entry <PDOC00017>).
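
Because the two subfamilies differ at a single position of the box (Asp versus His), the two signatures can be used together to assign a sequence to one subfamily. The sketch below assumes hand translations of the two consensus patterns into Python regular expressions, with the DEAD pattern read as ending in a closed residue set; the function name and the test peptides are invented for illustration.

```python
import re
from typing import Optional

# Hand translations of the two consensus patterns given above.
DEAD_BOX = re.compile(r"[LIVMF]{2}DEAD[RKEN].[LIVMFYGSTN]")
DEAH_BOX = re.compile(r"[GSAH].[LIVMF]{3}DE[ALIV]H[NECR]")


def classify_helicase_box(sequence: str) -> Optional[str]:
    """Return 'DEAD' or 'DEAH' for the signature present, or None if neither is.

    The DEAD signature is tested before the DEAH signature.
    """
    if DEAD_BOX.search(sequence):
        return "DEAD"
    if DEAH_BOX.search(sequence):
        return "DEAH"
    return None


# Invented peptides used only to exercise the two patterns.
for toy_peptide in ("MKVLDEADEMLS", "MKGAVIIDEAHERS", "MKTAYIAK"):
    print(toy_peptide, classify_helicase_box(toy_peptide))
```

A sequence matching neither signature is reported as None rather than forced into a subfamily, since the patterns are not claimed to detect every family member.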

[ 1] Schmid S.R., Linder P. Mol. Microbiol. 6:283-292(1992).

[ 2] Linder P., Lasko P., Ashburner M., Leroy P., Nielsen P.J., Nishi K., Schnier J., Slonimski
P.P. Nature 337:121-122(1989).

[ 3] Wassarman D.A., Steitz J.A. Nature 349:463-464(1991).

[ 4] Hodgman T.C. Nature 333:22-23(1988) and Nature 333:578-578(1988) (Errata).

[ 5] Harosh I., Deschavanne P. Nucleic Acids Res. 19:6331-6331(1991).

[ 6] Koonin E.V., Senkevich T.G. J. Gen. Virol. 73:989-993(1992).

132. (DHBP_synthase) 3,4-dihydroxy-2-butanone 4-phosphate synthase 

3,4-Dihydroxy-2-butanone 4-phosphate is biosynthesized from ribulose 5-phosphate
and serves as the biosynthetic precursor for the xylene ring of riboflavin. Sometimes found as
a bifunctional enzyme with GTP_cyclohydro2.

Richter G, Krieger C, Volk R, Kis K, Ritz H, Gotze E, Bacher A, Methods Enzymol 
1997;280:374-382. 

133. (DHDPS) Dihydrodipicolinate synthetase signatures 

Dihydrodipicolinate synthetase (EC 4.2.1.52) (DHDPS) [1] catalyzes, in higher plant
chloroplasts and in many bacteria (gene dapA), the first reaction specific to the biosynthesis of
lysine and of diaminopimelate. DHDPS is responsible for the condensation of aspartate
semialdehyde and pyruvate by a ping-pong mechanism in which pyruvate first binds to the
enzyme by forming a Schiff-base with a lysine residue. Three other proteins are structurally
related to DHDPS and probably also act via a similar catalytic mechanism:

- Escherichia coli N-acetylneuraminate lyase (EC 4.1.3.3) (gene nanA), which catalyzes the
condensation of N-acetyl-D-mannosamine and pyruvate to form N-acetylneuraminate.

- Rhizobium meliloti protein mosA [3], which is involved in the biosynthesis of the
rhizopine 3-o-methyl-scyllo-inosamine.

- Escherichia coli hypothetical protein yjhH.

Two signature patterns for these enzymes were developed. The first one is centered on a
highly conserved region in the N-terminal part of these proteins. The second signature
contains a lysine residue which has been shown, in Escherichia coli dapA [2], to be the one
that forms a Schiff-base with the substrate.

Consensus pattern: [GSA]-[LIVM]-[LIVMFY]-x(2)-G-[ST]-[TG]-G-E-[GASNF]-x(6)-[EQ]

Consensus pattern: Y-[DNS]-[LIVMFA]-P-x(2)-[ST]-x(3)-[LIVMG]-x(13,14)-[LIVM]-x-
[SGA]-[LIVMF]-K-[DEQAF]-[STAC] [K is involved in Schiff-base formation]

[ 1] Kaneko T., Hashimoto T., Kumpaisal R., Yamada Y. J. Biol. Chem. 265:17451- 
17455(1990). 

[ 2] Laber B., Gomis-Rueth F.-X., Romao M.J., Huber R. Biochem. J. 288:691-695(1992). 
[ 3] Murphy P.J., Trenz S.P., Grzemski W., de Bruijn F.J., Schell J. J. Bacteriol. 175:5193- 
5204 (1993). 

134. (DHOdehase) Dihydroorotate dehydrogenase signatures 

Dihydroorotate dehydrogenase (EC 1.3.3.1) (DHOdehase) catalyzes the fourth step in the de
novo biosynthesis of pyrimidine, the conversion of dihydroorotate into orotate. DHOdehase
is a ubiquitous FAD flavoprotein. In bacteria (gene pyrD), DHOdehase is located on the inner
side of the cytosolic membrane. In some yeasts, such as Saccharomyces cerevisiae (gene
URA1), it is a cytosolic protein while in other eukaryotes it is found in the mitochondria [1].
The sequence of DHOdehase is rather well conserved and two signature patterns specific to
this enzyme were developed. The first corresponds to a region in the N-terminal section
of the enzyme while the second is located in the C-terminal section and seems to be part of
the FAD-binding domain.



Consensus pattern: [GS]-x(4)-[GK]-[GSTA]-[LIVFSTA]-[GT]-x(3)-[NQR]-x-G-[NHY]-x(2)-
F-[RT]

Consensus pattern: [LIVM](2)-[GSA]-x-G-G-[IV]-x-[STGDN]-x(3)-[ACV]-x(6)-G-A

[ 1] Nagy M., Lacroute F., Thomas D. Proc. Natl. Acad. Sci. U.S.A. 89:8966-8970(1992). 

135. (DMRL_synthase) 6,7-dimethyl-8-ribityllumazine synthase 

136. (DNA_methylase) C-5 cytosine-specific DNA methylases signatures 

C-5 cytosine-specific DNA methylases (EC 2.1.1.73) (C5 Mtase) are enzymes that
specifically methylate the C-5 carbon of cytosines in DNA [1,2,3]. Such enzymes are found
in the proteins described below.

- As a component of type II restriction-modification systems in prokaryotes and some
bacteriophages. Such enzymes recognize a specific DNA sequence where they methylate a
cytosine. In doing so, they protect DNA from cleavage by type II restriction enzymes that
recognize the same sequence. The sequences of a large number of type II C-5 Mtases are
known.

- In vertebrates, there are a number of C-5 Mtases that methylate CpG dinucleotides. The
sequence of the mammalian enzyme is known.

C-5 Mtases share a number of short conserved regions. Two of them were selected. The first
is centered around a conserved Pro-Cys dipeptide in which the cysteine has been shown [4]
to be involved in the catalytic mechanism; it appears to form a covalent intermediate with
the C6 position of cytosine. The second region is located at the C-terminal extremity in
type-II enzymes.

Consensus pattern: [DENKS]-x-[FLIV]-x(2)-[GSTC]-x-P-C-x(2)-[FYWLIM]-S [C is the
active site residue]

Consensus pattern: [RKQGTF]-x(2)-G-N-[STAG]-[LIVMF]-x(3)-[LIVMT]-x(3)-[LIVM]-
x(3)-[LIVM]

[ 1] Posfai J., Bhagwat A.S., Roberts R.J. Gene 74:261-263(1988). 

[ 2] Kumar S., Cheng X., Klimasauskas S., Mi S., Posfai J., Roberts R.J., Wilson G.G.
Nucleic Acids Res. 22:1-10(1994).

[ 3] Lauster R., Trautner T.A., Noyer-Weidner M. J. Mol. Biol. 206:305-312(1989).

[ 4] Chen L., McMillan A.M., Chang W., Ezaz-Nikpay K., Lane W.S., Verdine G.L.
Biochemistry 30:11018-11025(1991).

137. (DNAphotolyase) DNA photolyases class 2 signatures 

Deoxyribodipyrimidine photolyase (EC 4.1.99.3) (DNA photolyase) [1,2] is a DNA repair
enzyme. It binds to UV-damaged DNA containing pyrimidine dimers and, upon absorbing a
near-UV photon (300 to 500 nm), breaks the cyclobutane ring joining the two pyrimidines of
the dimer. DNA photolyase is an enzyme that requires two chromophore-cofactors for its
activity: a reduced FADH2 and either 5,10-methenyltetrahydrofolate (5,10-MTHF) or an
oxidized 8-hydroxy-5-deazaflavin (8-HDF) derivative (F420). The folate or deazaflavin
chromophore appears to function as an antenna, while the FADH2 chromophore is thought to
be responsible for electron transfer. On the basis of sequence similarities [3], DNA
photolyases can be grouped into two classes. The second class contains enzymes from
Myxococcus xanthus, methanogenic archaebacteria, insects, fish and marsupial mammals. It
is not yet known what second cofactor is bound to class 2 enzymes. There are a number of
conserved sequence regions in all known class 2 DNA photolyases, especially in the C-
terminal part. Two of these regions were selected as signature patterns.

Consensus pattern: F-x-E-E-x-[LIVM](2)-R-R-E-L-x(2)-N-F

Consensus pattern: G-x-H-D-x(2)-W-x-E-R-x-[LIVM]-F-G-K-[LIVM]-R-[FY]-M-N

[ 1] Sancar G.B., Sancar A. Trends Biochem. Sci. 12:259-261(1987). 
[ 2] Jorns M.S. Biofactors 2:207-211(1990). 

[ 3] Yasui A., Eker A.P.M., Yasuhira S., Yajima H., Kobayashi T., Takao M., Oikawa A. 
EMBO J. 13:6143-6151(1994). 

(DNAphotolyase2) DNA photolyases class 1 signatures 

Deoxyribodipyrimidine photolyase (EC 4.1.99.3) (DNA photolyase) [1,2] is a DNA repair 
enzyme. It binds to UV-damaged DNA containing pyrimidine dimers and, upon absorbing a 
near-UV photon (300 to 500 nm), breaks the cyclobutane ring joining the two pyrimidines of 
the dimer. DNA photolyase requires two chromophore cofactors for its 
activity: a reduced FADH2 and either 5,10-methenyltetrahydrofolate (5,10-MTHF) or an 
oxidized 8-hydroxy-5-deazaflavin (8-HDF) derivative (F420). The folate or deazaflavin 
chromophore appears to function as an antenna, while the FADH2 chromophore is thought to 
be responsible for electron transfer. On the basis of sequence similarities [3], DNA 
photolyases can be grouped into two classes. The first class contains enzymes from 
Gram-negative and Gram-positive bacteria, the halophilic archaebacterium Halobacterium halobium, 
fungi and plants. Class 1 enzymes bind either 5,10-MTHF (E.coli, fungi, etc.) or 8-HDF 
(S.griseus, H.halobium). This family also includes Arabidopsis cryptochromes 1 (CRY1) and 
2 (CRY2), which are blue light photoreceptors that mediate blue light-induced gene 
expression. There are a number of conserved sequence regions in all known class 1 DNA 
photolyases, especially in the C-terminal part. Two of these regions were selected as 
signature patterns. 

Consensus pattern: T-G-x-P-[LIVM](2)-D-A-x-M-[RA]-x-[LIVM]- 

Consensus pattern: [DN]-R-x-R-[LIVM](2)-x-[STA](2)-F-[LIVMFA]-x-K-x-L-x(2,3)-W- 
[KRQ]- 

[ 1] Sancar G.B., Sancar A. Trends Biochem. Sci. 12:259-261(1987). 
[ 2] Jorns M.S. Biofactors 2:207-211(1990). 

[ 3] Yasui A., Eker A.P.M., Yasuhira S., Yajima H., Kobayashi T., Takao M., Oikawa A. 
EMBO J. 13:6143-6151(1994). 

[ 4] Lin C, Ahmad M., Cashmore A.R. Plant J. 10:893-902(1996). 

138. (DNA_pol_A) 

DNA polymerase family A signature 

Replicative DNA polymerases (EC 2.7.7.7) are the key enzymes catalyzing the accurate 
replication of DNA. They require either a small RNA molecule or a protein as a primer for 
the de novo synthesis of a DNA chain. On the basis of sequence similarities a number of 
DNA polymerases have been grouped together [1,2,3] under the designation of DNA 
polymerase family A. The polymerases that belong to this family are listed below. 

- Escherichia coli and various other bacterial polymerase I (gene polA). 

- Thermus aquaticus Taq polymerase. 

- Bacteriophage SPO1 polymerase. 

- Bacteriophage SPO2 polymerase. 

- Bacteriophage T5 polymerase. 

- Bacteriophage T7 polymerase. 

- Mycobacteriophage L5 polymerase. 

- Yeast mitochondrial polymerase gamma (gene MIP1). 

Five regions of similarity are found in all the above polymerases. One of these conserved 
regions, known as 'motif B' [1], is located in a domain which, in Escherichia coli polA, has 
been shown to bind deoxynucleotide triphosphate substrates. It contains a conserved tyrosine 
which has been shown, by photo-affinity labelling, to be in the active site. A conserved 
lysine, also part of this motif, can be chemically labelled using pyridoxal phosphate. This 
conserved region was used as a signature for this family of DNA polymerases. 

Consensus pattern: R-x(2)-[GSAV]-K-x(3)-[LIVMFY]-[AGQ]-x(2)-Y-x(2)-[GS]-x(3)- 
[LIVMA] 
Sequences known to belong to this class detected by the pattern: ALL. 

[ 1] Delarue M., Poch O., Todro N., Moras D., Argos P. Protein Eng. 3:461-467(1990). 
[ 2] Ito J., Braithwaite D.K. Nucleic Acids Res. 19:4045-4057(1991). 
[ 3] Braithwaite D.K., Ito J. Nucleic Acids Res. 21:787-802(1993). 

139. DNA_pol_viral_C 

DNA polymerase (viral) C-terminal domain 
Number of members: 128 

140. (DNA_topoisoII) 

DNA topoisomerase II signature 

DNA topoisomerase II (EC 5.99.1.3) [1,2,3,4,E1] is one of the two types of enzyme that 
catalyze the interconversion of topological DNA isomers. Type II topoisomerases are 
ATP-dependent and act by passing a DNA segment through a transient double-strand break. 
Topoisomerase II is found in phages, archaebacteria, prokaryotes, eukaryotes, and in African 
Swine Fever virus (ASF). In bacteriophage T4, topoisomerase II consists of three subunits 
(the products of genes 39, 52 and 60). In prokaryotes and in archaebacteria the enzyme, 
known as DNA gyrase, consists of two subunits (genes gyrA and gyrB [E2]). In some 
bacteria, a second type II topoisomerase has been identified; it is known as topoisomerase IV 
and is required for chromosome segregation. It also consists of two subunits (genes parC and 
parE). In eukaryotes, type II topoisomerase is a homodimer. 

There are many regions of sequence homology between the different subtypes of 
topoisomerase II. The relation between the different subunits is shown in the following 
representation: 

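<-------------About-1400-residues------------->

[----Protein 39-*-----][-----Protein 52------]    Phage T4

[--------gyrB-*-------][--------gyrA---------]    Prokaryote II and
                                                  Archaebacteria

[--------parE-*-------][--------parC---------]    Prokaryote IV

[-------------*------------------------------]    Eukaryote and ASF

'*': Position of the pattern. 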

As a signature pattern for this family of proteins, a region that contains a highly conserved 
pentapeptide was selected. The pattern is located in gyrB, in parE, and in protein 39 of phage 
T4 topoisomerase. 

Consensus pattern: [LIVMA]-x-E-G-[DN]-S-A-x-[STAG] 
Sequences known to belong to this class detected by the pattern: ALL. 

[ 1] Sternglanz R. Curr. Opin. Cell Biol. 1:533-535(1990). 

[ 2] Bjornsti M.-A. Curr. Opin. Struct. Biol. 1:99-103(1991). 

[ 3] Sharma A., Mondragon A. Curr. Opin. Struct. Biol. 5:39-47(1995). 

[ 4] Roca J. Trends Biochem. Sci. 20:156-160(1995). 

141. (DSPc) Tyrosine specific protein phosphatases signature and profiles 

Tyrosine specific protein phosphatases (EC 3.1.3.48) (PTPase) [1 to 5] are enzymes that 
catalyze the removal of a phosphate group attached to a tyrosine residue. These enzymes are 
very important in the control of cell growth, proliferation, differentiation and transformation. 
Multiple forms of PTPase have been characterized and can be classified into two categories: 
soluble PTPases and transmembrane receptor proteins that contain PTPase domain(s). The 
currently known PTPases are listed below. 

Soluble PTPases. 

- PTPN1 (PTP-1B). 
- PTPN2 (T-cell PTPase; TC-PTP). 
- PTPN3 (H1) and PTPN4 (MEG), enzymes that contain an N-terminal band 4.1-like 
domain (see <PDOC00566>) and could act at junctions between the membrane and 
cytoskeleton. 
- PTPN5 (STEP). 
- PTPN6 (PTP-1C; HCP; SHP) and PTPN11 (PTP-2C; SH-PTP3; Syp), enzymes which 
contain two copies of the SH2 domain at their N-terminal extremity. The Drosophila 
protein corkscrew (gene csw) also belongs to this subgroup. 
- PTPN7 (LC-PTP; Hematopoietic protein-tyrosine phosphatase; HePTP). 
- PTPN8 (70Z-PEP). 
- PTPN9 (MEG2). 
- PTPN12 (PTP-G1; PTP-P19). 
- Yeast PTP1. 
- Yeast PTP2, which may be involved in the ubiquitin-mediated protein degradation pathway. 
- Fission yeast pyp1 and pyp2, which play a role in inhibiting the onset of mitosis. 
- Fission yeast pyp3, which contributes to the dephosphorylation of cdc2. 
- Yeast CDC14, which may be involved in chromosome segregation. 
- Yersinia virulence plasmid PTPases (gene yopH). 
- Autographa californica nuclear polyhedrosis virus 19 Kd PTPase. 

Dual specificity PTPases. 

- DUSP1 (PTPN10; MAP kinase phosphatase-1; MKP-1), which dephosphorylates MAP 
kinase on both Thr-183 and Tyr-185. 
- DUSP2 (PAC-1), a nuclear enzyme that dephosphorylates MAP kinases ERK1 and ERK2 
on both Thr and Tyr residues. 
- DUSP3 (VHR). 
- DUSP4 (HVH2). 
- DUSP5 (HVH3). 
- DUSP6 (Pyst1; MKP-3). 
- DUSP7 (Pyst2; MKP-X). 
- Yeast MSG5, a PTPase that dephosphorylates MAP kinase FUS3. 
- Yeast YVH1. 
- Vaccinia virus H1 PTPase; a dual specificity phosphatase. 

Receptor PTPases. 

Structurally, all known receptor PTPases are made up of a variable length extracellular 
domain, followed by a transmembrane region and a C-terminal catalytic cytoplasmic domain. 
Some of the receptor PTPases contain fibronectin type III (FN-III) repeats, 
immunoglobulin-like domains, MAM domains or carbonic anhydrase-like domains in their 
extracellular region. The cytoplasmic region generally contains two copies of the PTPase 
domain. The first seems to have enzymatic activity, while the second is inactive but seems to 
affect substrate specificity of the first. In these domains, the catalytic cysteine is generally 
conserved but some other, presumably important, residues are not. In the following table, the 
domain structure of known receptor PTPases is shown: 

                                            Extracellular        Intracellular
                                         Ig  FN-3  CAH  MAM        PTPase
Leukocyte common antigen (LCA) (CD45)     0     2    0    0             2
Leukocyte antigen related (LAR)           3     8    0    0             2
Drosophila DLAR                           3     9    0    0             2
Drosophila DPTP                           2     2    0    0             2
PTP-alpha (LRP)                           0     0    0    0             2
PTP-beta                                  0    16    0    0             1
PTP-gamma                                 0     1    1    0             2
PTP-delta                                 0    >7    0    0             2
PTP-epsilon                               0     0    0    0             2
PTP-kappa                                 1     4    0    1             2
PTP-mu                                    1     4    0    1             2
PTP-zeta                                  0     1    1    0             2

PTPase domains consist of about 300 amino acids. There are two conserved cysteines; the 
second one has been shown to be absolutely required for activity. Furthermore, a number of 
conserved residues in its immediate vicinity have also been shown to be important. A 
signature pattern for PTPase domains was derived centered on the active site cysteine. There 
are three profiles for PTPases: the first one spans the complete domain and is not specific to 
any subtype, the second profile is specific to dual-specificity PTPases, and the third one is 
specific to the PTP subfamily. 

Consensus pattern: [LIVMF]-H-C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY] [C is the 
active site residue] - 

[ 1] Fischer E.H., Charbonneau H., Tonks N.K. Science 253:401-406(1991). 
[ 2] Charbonneau H., Tonks N.K. Annu. Rev. Cell Biol. 8:463-493(1992). 
[ 3] Trowbridge I.S. J. Biol. Chem. 266:23517-23520(1991). 
[ 4] Tonks N.K., Charbonneau H. Trends Biochem. Sci. 14:497-500(1989). 

[ 5] Hunter T. Cell 58:1013-1016(1989). 

142. (DUF10) Uncharacterized protein family UPF0076 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: 

- Goat antigen UK114, a human homolog and the corresponding rat protein, which is known as 
perchloric acid soluble protein (PSP1). PSP1 [2] may inhibit an initiation stage of cell-free 
protein synthesis. 
- Mouse heat-responsive protein HRSP12. 
- Yeast chromosome V hypothetical protein YER057c. 
- Yeast chromosome IX hypothetical protein YIL051c. 
- Caenorhabditis elegans hypothetical protein C23G10.2. 
- Escherichia coli hypothetical protein ycdK. 
- Escherichia coli hypothetical protein yhaR. 
- Escherichia coli hypothetical protein yjgF and HI0719, the corresponding Haemophilus 
influenzae protein. 
- Escherichia coli hypothetical protein yoaB. 
- Bacillus subtilis hypothetical protein yabJ. 
- Haemophilus influenzae hypothetical protein HI1627. 
- Helicobacter pylori hypothetical protein HP0944. 
- Lactococcus lactis aldR. 
- Myxococcus xanthus dfrA. 
- Synechocystis strain PCC 6803 hypothetical protein slr0709. 
- Rhizobium strain NGR234 symbiotic plasmid hypothetical protein y4sK. 
- Pyrococcus horikoshii hypothetical protein PH0854. 

These are small proteins of around 15 Kd whose sequence is highly conserved. As a signature 
pattern, a well conserved region located in the C-terminal part of these proteins was selected. 

Consensus pattern: [PA]-[ASTPV]-R-[SACVF]-x-[LIVMFY]-x(2)-[GSAKR]-x-[LMVA]- 
x(5,8)-[LIVM]-E-[MI]- 

[ 1] Bairoch A. Unpublished observations (1995). 

[ 2] Oka T., Tsuji H., Noda C., Sakai K., Hong Y.-M., Suzuki I., Munoz S., Natori Y. J. Biol. 
Chem. 270:30060-30067(1995). 

143. (DUF3) Domain of Unknown Function 3 

Domain apparently occurring exclusively in eubacteria. Unknown 
function. 

144. (DUF6) Integral membrane protein 

This family includes many hypothetical membrane proteins of unknown function. 
Many of the proteins contain two copies of the aligned region. 

145. (DUF7) Integral membrane protein 

This family includes many hypothetical membrane proteins of unknown function. 
Swiss:P14502 has been implicated in resistance to ethidium bromide. 

146. (DapB) Dihydrodipicolinate reductase signature 

Dihydrodipicolinate reductase (EC 1.3.1.26) catalyzes the second step in the biosynthesis of 
diaminopimelic acid and lysine, the NAD- or NADP-dependent reduction of 
2,3-dihydrodipicolinate into 2,3,4,5-tetrahydrodipicolinate. This enzyme is present in bacteria 
(gene dapB) and higher plants. As a signature pattern the best conserved region in this 
enzyme was selected. It is located in the central section and is part of the substrate-binding 
region [1]. 

Consensus pattern: E-[IV]-x-E-x-H-x(3)-K-x-D-x-P-S-G-T-A- 

[ 1] Scapin G., Blanchard J.S., Sacchettini J.C. Biochemistry 34:3502-3512(1995). 

147. DedA family 

This family combines the DedA related proteins and YIAN/YGIK family. Members 
of this family are not functionally characterised. These proteins contain multiple predicted 
transmembrane regions. 

148. DegT/DnrJ/EryC1/StrS family 

The members of this family exhibit some characteristics of the sensor protein of 
two-component signal transduction systems; however, none of the members shows any sequence 
similarity to these protein kinases. The members of this family do have the typical 
helix-turn-helix motif of DNA binding proteins. 

[1] Stutzman-Engwall KJ, Otten SL, Hutchinson CR, J Bacteriol 1992;174:144-154. 

149. (Desaturase) Fatty acid desaturases signatures 

Fatty acid desaturases (EC 1.14.99.-) are enzymes that catalyze the insertion of a double bond 
at the delta position of fatty acids. There seem to be two distinct families of fatty acid 
desaturases which do not seem to be evolutionarily related. 

Family 1 is composed of: 

- Stearoyl-CoA desaturase (SCD) (EC 1.14.99.5) [1]. SCD is a key regulatory enzyme of 
unsaturated fatty acid biosynthesis. SCD introduces a cis double bond at the delta(9) position 
of fatty acyl-CoA's such as palmitoleoyl- and oleoyl-CoA. SCD is a membrane-bound 
enzyme that is thought to function as a part of a multienzyme complex in the endoplasmic 
reticulum of vertebrates and fungi. 

As a signature pattern for this family a conserved region in the C-terminal part of these 
enzymes was selected; this region is rich in histidine residues and in aromatic residues. 

Family 2 is composed of: 

- Plant stearoyl-acyl-carrier-protein desaturase (EC 1.14.99.6) [2]. These enzymes catalyze 
the introduction of a double bond at the delta(9) position of stearoyl-ACP to produce 
oleoyl-ACP. This enzyme is responsible for the conversion of saturated fatty acids to 
unsaturated fatty acids in the synthesis of vegetable oils. 
- Cyanobacteria desA [3], an enzyme that can introduce a second cis double bond at the 
delta(12) position of fatty acids bound to membrane glycerolipids. DesA is involved in 
chilling tolerance, because the phase transition temperature of lipids of cellular membranes 
depends on the degree of unsaturation of the fatty acids of the membrane lipids. 

As a signature pattern for this family a conserved region in the C-terminal part of these 
enzymes was selected. 

Consensus pattern: G-E-x-[FY]-H-N-[FY]-H-H-x-F-P-x-D-Y- 

Consensus pattern: [ST]-[SA]-x(3)-[QR]-[LI]-x(5,6)-D-Y-x(2)-[LIVMFYW]-[LIVM]-[DE]- 

[ 1] Kaestner K.H., Ntambi J.M., Kelly T.J. Jr., Lane M.D. J. Biol. Chem. 264:14755- 
14761(1989). 

[ 2] Shanklin J., Somerville C.R. Proc. Natl. Acad. Sci. U.S.A. 88:2510-2514(1991). 
[ 3] Wada H., Gombos Z., Murata N. Nature 347:200-203(1990). 

150. Dihydroorotase signatures 

Dihydroorotase (EC 3.5.2.3) (DHOase) catalyzes the third step in the de novo biosynthesis of 
pyrimidine, the conversion of ureidosuccinic acid (N-carbamoyl-L-aspartate) into 
dihydroorotate. Dihydroorotase binds a zinc ion which is required for its catalytic activity [1]. 
In bacteria, DHOase is a dimer of identical chains of about 400 amino-acid residues (gene 
pyrC). In higher eukaryotes, DHOase is part of a large multi-functional protein, known as 
'rudimentary' in Drosophila and CAD in mammals, which catalyzes the first three steps of 
pyrimidine biosynthesis [2]. The DHOase domain is located in the central part of this 
polyprotein. In yeasts, DHOase is encoded by a monofunctional protein (gene URA4). 
However, a defective DHOase domain [3] is found in a multifunctional protein (gene 
URA2) that catalyzes the first two steps of pyrimidine biosynthesis. The comparison of 
DHOase sequences from various sources shows [4] that there are two highly conserved 
regions. The first, located in the N-terminal extremity, contains two histidine residues 
suggested [3] to be involved in binding the zinc ion. The second is found in the C-terminal 
part. Signature patterns for both regions have been developed. Allantoinase (EC 3.5.2.5) is 
the enzyme that hydrolyzes allantoin into allantoate. In yeast (gene DAL1) [5], it is the first 
enzyme in the allantoin degradation pathway; in amphibians [6] and fish it catalyzes the 
second step in the degradation of uric acid. The sequence of allantoinase is evolutionarily 
related to that of DHOases. 

Consensus pattern: D-[LIVMFYWSAP]-H-[LIVA]-H-[LIVF]-[RN]-x-[PGANF] [The two 
H's are probable zinc ligands] 

Consensus pattern: [GA]-[ST]-D-x-A-P-H-x(4)-K- 

[ 1] Brown D.C., Collins K.D. J. Biol. Chem. 266:1597-1604(1991). 

[ 2] Davidson J.N., Chen K.C., Jamison R.S., Musmanno L.A., Kern C.B. BioEssays 15:157- 
164(1993). 

[ 3] Souciet J.-L., Nagy M., Le Gouar M., Lacroute F., Potier S. Gene 79:59-70(1989). 
[ 4] Guyonvarch A., Nguyen-Juilleret M., Hubert J.-C., Lacroute F. Mol. Gen. Genet. 
212:134-141(1988). 

[ 5] Buckholz R.G., Cooper T.G. Yeast 7:913-923(1991). 

[ 6] Hayashi S., Jain S., Chu R., Alvares K., Xu B., Erfurth F., Usuda N., Rao M.S., Reddy 
S.K., Noguchi T., Reddy J.K., Yeldandi A.Y. J. Biol. Chem. 269:12269-12276(1994). 

151. dnaJ domains signatures and profile 

The prokaryotic heat shock protein dnaJ interacts with the chaperone hsp70-like dnaK 
protein [1]. Structurally, the dnaJ protein consists of an N-terminal conserved domain (called 
the 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, 
a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a 
C-terminal region of 120 to 170 residues. Such a structure is shown in the following schematic 
representation: 

+------------+-+-------+-+----------+------------+
| N-terminal | | Gly-R | | CXXCXGXG | C-terminal |
+------------+-+-------+-+----------+------------+

It has been shown [2] that the 'J' domain as well as the 'CRR' domain are also found in 
other prokaryotic and eukaryotic proteins which are listed below. 

a) Proteins containing both a 'J' and a 'CRR' domain: 

- Yeast protein MAS5/YDJ1 which seems to be involved in mitochondrial protein 
import. 

- Yeast protein MDJ1, involved in mitochondrial biogenesis and protein folding. 

- Yeast protein SCJ1, involved in protein sorting. 

- Yeast protein XDJ1. 

- Plants dnaJ homologs (from leek and cucumber). 

- Human HDJ2, a dnaJ homolog of unknown function. 

- Yeast hypothetical protein YNL077w. 

b) Proteins containing a 'J' domain without a 'CRR' domain: 

- Rhizobium fredii nolC, a protein involved in cultivar-specific nodulation of 
soybean. 

- Escherichia coli cbpA [3], a protein that binds curved DNA. 

- Yeast protein SEC63/NPL1, important for protein assembly into the endoplasmic 
reticulum and the nucleus. 

- Yeast protein SIS1, required for nuclear migration during mitosis. 

- Yeast protein CAJ1. 

- Yeast hypothetical protein YFR041c. 

- Yeast hypothetical protein YIR004w. 

- Yeast hypothetical protein YJL162c. 

- Plasmodium falciparum ring-infected erythrocyte surface antigen (RESA). RESA, 
whose function is not known, is associated with the membrane skeleton of newly 
invaded erythrocytes. 

- Human HDJ1. 

- Human HSJ1, a neuronal protein. 

- Drosophila cysteine-string protein (csp). 

A signature pattern for the 'J' domain was developed, based on conserved positions in 
the C-terminal half of this domain. A pattern for the 'CRR' domain, based on the first two 
copies of that motif, was also developed, as was a profile for the 'J' domain. 

Consensus pattern: [FY]-x(2)-[LIVMA]-x(3)-[FYWHNT]-[DENQSA]-x-L-x-[DN]-x(3)- 
[KR]-x(2)-[FYI]- 

Consensus pattern: C-[DEGSTHKR]-x-C-x-G-x-[GK]-[AGSDM]-x(2)-[GSNKR]-x(4,6)-C- 
x(2,3)-C-x-G-x-G- 
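
The 'CRR' domain is described above as four repeats of a CXXCXGXG motif. The short Python sketch below is an illustration only and not the method used to derive the signature. It lists the 1-based start positions of every strict CXXCXGXG occurrence in a sequence. The function name find_crr_repeats and the example sequence are hypothetical, and the strict motif is narrower than the printed consensus pattern, which also allows K at the last position of the first copy.

```python
import re

# Strict form of the repeat named in the text: C-x-x-C-x-G-x-G.
CRR_MOTIF = re.compile(r"C..C.G.G")


def find_crr_repeats(sequence):
    """Return 1-based start positions of non-overlapping CXXCXGXG motifs."""
    return [m.start() + 1 for m in CRR_MOTIF.finditer(sequence.upper())]


# Made-up sequence containing two copies of the motif.
example = "AAA" + "CPHCNGRG" + "AAAA" + "CSTCKGSG" + "AAA"
positions = find_crr_repeats(example)
print(positions)       # start position of each copy
print(len(positions))  # number of copies found
```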

[1] Cyr D.M., Langer T., Douglas M.G. Trends Biochem. Sci. 19:176-181(1994). 

[2] Bork P., Sander C., Valencia A., Bukau B. Trends Biochem. Sci. 17:129-129(1992). 

[3] Ueguchi C., Kaneda M., Yamada H., Mizuno T. Proc. Natl. Acad. Sci. U.S.A. 91:1054- 
1058(1994). 

152. 

153. Dwarfin 

This family, known as the dwarfins, also includes the Drosophila protein MAD. The 
N-terminus of MAD can bind to DNA [2]. 

[1] Yingling JM, Das P, Savage C, Zhang M, Padgett RW, Wang XF, Proc Natl Acad 
Sci U S A 1996;93:8940-8944. [2] Kim J, Johnson K, Chen HJ, Carroll S, Laughon A, 
Nature 1997;388:304-308. 

154. Dynein light chain type 1 signature 

Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force 
generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as 
a motor for the intracellular retrograde motility of vesicles and organelles along microtubules. 
Dynein is composed of a number of ATP-binding large subunits, intermediate size subunits 
and small subunits. Among the small subunits, there is a family [1,2] of highly conserved 
proteins which consists of: 

- Chlamydomonas reinhardtii flagellar outer arm dynein 8 Kd and 11 Kd light chains. 
- Higher eukaryotes cytoplasmic dynein light chain 1. 
- Yeast cytoplasmic dynein light chain 1 (gene DYN2 or SLC1). 
- Caenorhabditis elegans hypothetical dynein light chains M18.2 and T26A5.9. 

These proteins have from 89 to 120 amino acids. As a signature pattern, a highly conserved 
region was selected. 

Consensus pattern: H-x-I-x-G-[KR]-x-F-[GA]-S-x-V-[ST]-[HY]-E- 

[ 1] King S.M., Patel-King R.S. J. Biol. Chem. 270:11445-11452(1995). 

[ 2] Dick T., Ray K., Salz H.K., Chia W. Mol. Cell. Biol. 16:1966-1977(1996). 

155. dUTPase 

dUTPase hydrolyzes dUTP to dUMP and pyrophosphate. 

[1] Cedergren-Zeppezauer ES, Larsson G, Nyman PO, Dauter Z, Wilson KS, Nature 
1992;355:740-743. [2] Mol CD, Harris JM, Mcintosh EM, Tainer JA, Structure 
1996;4:1077-1092. 

156. (dCMP_cyt_deam) Cytidine and deoxycytidylate deaminases zinc-binding region 
signature 

Cytidine deaminase (EC 3.5.4.5) (cytidine aminohydrolase) catalyzes the hydrolysis of 
cytidine into uridine and ammonia, while deoxycytidylate deaminase (EC 3.5.4.12) (dCMP 
deaminase) hydrolyzes dCMP into dUMP. Both enzymes are known to bind zinc and to 
require it for their catalytic activity [1,2]. These two enzymes do not share any sequence 
similarity with the exception of a region that contains three conserved histidine and cysteine 
residues which are thought to be involved in the binding of the catalytic zinc ion. Such a 
region is also found in other proteins [3,4]: 

- Yeast cytosine deaminase (EC 3.5.4.1) (gene FCY1), which transforms cytosine into uracil. 
- Mammalian apolipoprotein B mRNA editing protein, responsible for the 
posttranscriptional editing of a CAA codon into a UAA (stop) codon in the APOB mRNA. 
- Riboflavin biosynthesis protein ribG, which converts 
2,5-diamino-6-(ribosylamino)-4(3H)-pyrimidinone 5'-phosphate into 
5-amino-6-(ribosylamino)-2,4(1H,3H)-pyrimidinedione 5'-phosphate. 
- Bacillus cereus blasticidin-S deaminase (EC 3.5.4.23), which catalyzes the deamination of 
the cytosine moiety of the antibiotics blasticidin S, cytomycin and acetylblasticidin S. 
- Bacillus subtilis protein comEB. This protein is required for the binding and uptake of 
transforming DNA. 
- Bacillus subtilis hypothetical protein yaaJ. 
- Escherichia coli hypothetical protein yfhC. 
- Yeast hypothetical protein YJL035c. 

A signature pattern for this zinc-binding region was derived. 

Consensus pattern: [CH]-[AGV]-E-x(2)-[LIVMFGAT]-[LIVM]-x(17,33)-P-C-x(2,8)-C- 
x(3)-[LIVM] [The C's and H are zinc ligands] 

[ 1] Yang C, Carlow D., Wolfenden R., Short S.A. Biochemistry 31:4168-4174(1992). 
[ 2] Moore J.T., Silversmith R.E., Maley G.F., Maley F. J. Biol. Chem. 268:2288- 
2291(1993). 

[ 3] Reizer J., Buskirk S., Bairoch A., Reizer A., Saier M.H. Jr. Protein Sci. 3:853-856(1994). 
[ 4] Bhattacharya S., Navaratnam N., Morrison J.R., Scott J., Taylor W.R. Trends Biochem. 
Sci. 19:105-106(1994). 

157. Dehydrins signatures 

A number of proteins are produced by plants that experience water-stress. Water-stress takes 
place when the water available to a plant falls below a critical level. The plant hormone 
abscisic acid (ABA) appears to modulate the response of plant to water-stress. Proteins that 
are expressed during water-stress are called dehydrins [1,2] or LEA group 2 proteins [3]. The 
proteins that belong to this family are listed below. 

- Arabidopsis thaliana XERO 1, XERO 2 (LTI30), RAB18, ERD10 (LTI45), 
ERD14 and COR47. 

- Barley dehydrins B8, B9, B17, and B18. 

- Cotton LEA protein D-11. 

- Craterostigma plantagineum desiccation-related proteins A and B. 

- Maize dehydrin M3 (RAB-17). 

- Pea dehydrins DHN1, DHN2, and DHN3. 

- Radish LEA protein. 

- Rice proteins RAB 16B, 16C, 16D, RAB21, and RAB25. 

- Tomato TAS14. 

- Wheat dehydrin RAB 15 and cold-shock proteins cor410, cs66 and cs120. 

Dehydrins share a number of structural features. One of the most notable features is the 
presence, in their central region, of a continuous run of five to nine serines followed by a cluster of 
charged residues. Such a region has been found in all known dehydrins so far with the exception of pea 
dehydrins. A second conserved feature is the presence of two copies of a lysine-rich octapeptide; the 
first copy is located just after the cluster of charged residues that follows the poly-serine region and the 
second copy is found at the C-terminal extremity. Signature patterns for both regions were derived. 

Consensus pattern: S(5)-[DE]-x-[DE]-G-x(1,2)-G-x(0,1)-[KR](4) 
Consensus pattern: [KR]-[LIM]-K-[DE]-K-[LIM]-P-G- 

[1] Close T.J., Kortt A.A., Chandler P.M. Plant Mol. Biol. 13:95-108(1989). 

[2] Robertson M., Chandler P.M. Plant Mol. Biol. 19:1031-1044(1992). 

[3] Dure L. III, Crouch M., Harada J., Ho T.-H. D., Mundy J., Quatrano R., Thomas T., Sung 
Z.R. Plant Mol. Biol. 12:475-486(1989). 

158. (deoR) Bacterial regulatory proteins, deoR family signature 

The many bacterial transcription regulation proteins which bind DNA through a 
'helix-turn-helix' motif can be classified into subfamilies on the basis of sequence similarities. 
One of these subfamilies groups the following proteins [1,2]: 

- accR, the Agrobacterium tumefaciens plasmid pTiC58 repressor of opine catabolism and 
conjugal transfer. 
- agaR, the Escherichia coli aga operon putative repressor. 
- deoR, the Escherichia coli deoxyribose operon repressor. 
- fucR, the Escherichia coli L-fucose operon activator. 
- gatR, the Escherichia coli galactitol operon repressor. 
- glpR, the Escherichia coli glycerol-3-phosphate regulon repressor. 
- gutR (or srlR), the Escherichia coli glucitol operon repressor. 
- iolR, from Bacillus subtilis. 
- lacR, the streptococci lactose phosphotransferase system repressor. 
- spoIIID, the Bacillus subtilis transcription regulator of the sigK gene. 
- yfjR, an Escherichia coli hypothetical protein. 
- ygbI, an Escherichia coli hypothetical protein. 
- yihW, an Escherichia coli hypothetical protein. 
- yjfQ, an Escherichia coli hypothetical protein. 
- yjhJ, an Escherichia coli hypothetical protein. 

The 'helix-turn-helix' DNA-binding motif of these proteins is located in the N-terminal part 
of the sequence. The pattern used to detect these proteins starts fourteen residues before the 
HTH motif and ends one residue after it. 

Consensus pattern: R-x(3)-[LIVM]-x(3)-[LIVM]-x(16,17)-[STA]-x(2)-T-[LIVMA]-[RH]- 
[KRNA]-D-[LIVMF]- 
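
Because a pattern can contain variable-length gaps such as x(16,17), the number of residues it spans is a range and not a single value. As a rough sketch, the Python below adds up the minimum and maximum length of each element to obtain that range. The helper name pattern_span is hypothetical and the code is not part of the original description.

```python
import re

# x, a single residue, or a bracketed set, with an optional repeat count.
_ELEMENT = re.compile(r"(?:x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")


def pattern_span(pattern):
    """Return (shortest, longest) number of residues a pattern can cover."""
    shortest = longest = 0
    for element in pattern.replace(" ", "").split("-"):
        if not element:  # ignore a trailing hyphen
            continue
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unrecognized pattern element: %r" % element)
        low = int(match.group(1)) if match.group(1) else 1
        high = int(match.group(2)) if match.group(2) else low
        shortest += low
        longest += high
    return shortest, longest


DEOR = ("R-x(3)-[LIVM]-x(3)-[LIVM]-x(16,17)-[STA]-x(2)-T-"
        "[LIVMA]-[RH]-[KRNA]-D-[LIVMF]")
print(pattern_span(DEOR))  # (34, 35)
```

For the deoR pattern the span is 34 to 35 residues, the one-residue difference coming from the x(16,17) gap.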

[ 1] von Bodman S., Hayman G.T., Farrand S.K. Proc. Natl. Acad. Sci. U.S.A. 89:643- 
647(1992). 

[ 2] Bairoch A. Unpublished observations (1993). 

159. dsrm 

Double-stranded RNA binding motif 

[1] Burd CG, Dreyfuss G; Medline: 94310455, Conserved structures and diversity of 
functions of RNA-binding proteins. Science 1994;265:615-621. 

Sequences gathered for seed by HMM_iterative_training. Putative motif shared by proteins 
that bind to dsRNA. At least some DSRM proteins seem to bind to specific RNA targets. 
Exemplified by Staufen, which is involved in localization of at least five different mRNAs in 
the early Drosophila embryo. Also by interferon-induced protein kinase in humans, which is 
part of the cellular response to dsRNA. 

Number of members: 116 

160. Dynamin family signature 

Dynamin [1,2] is a microtubule-associated force-producing protein of 100 Kd which is 
involved in the production of microtubule bundles and which is able to bind and hydrolyze 
GTP. Dynamin is structurally related to the following proteins: 

- Drosophila shibire protein (gene shi) [3]. Shibire is, very probably, the Drosophila cognate 
of mammalian dynamin. It seems to provide the motor for vesicular transport during 
endocytosis. 
- Yeast vacuolar sorting protein VPS1 (or SPO15) [4], a protein which could also be involved 
in microtubule-associated motility. 
- Yeast protein MGM1 [5], which is required for mitochondrial genome maintenance. 
- Yeast protein DNM1, which is involved in endocytosis. 
- Interferon induced Mx proteins [6,7]. Interferon alpha or beta induce the synthesis of a 
family of closely related proteins. Most of these proteins are known to confer resistance to 
influenza viruses and/or rhabdoviruses on transfected mammalian cells in culture. 

The three motifs found in all GTP-binding proteins are located in the N-terminal part of these 
proteins. The signature pattern that was developed for these proteins is based on a highly 
conserved region downstream of the ATP/GTP-binding motif 'A' (P-loop) (see <PDOC00017>). 

Consensus pattern: L-P-[RK]-G-[STN]-[GN]-[LIVM]-V-T-R- 

[ 1] Vallee R.B., Shpetner H.S. Annu. Rev. Biochem. 59:909-932(1990). 
[ 2] Obar R.A., Collins C.A., Hammarback J.A., Shpetner H.S., Vallee R.B. Nature 347:256- 
261(1990). 

[ 3] van der Bliek A., Meyerowitz E.M. Nature 351:411-414(1991). 

[ 4] Rothman J.H., Raymond C.K., Gilbert T., O'Hara P.J., Stevens T.H. Cell 61:1063- 
1074(1990). 

[ 5] Jones B.A., Fangman W.L. Genes Dev. 6:380-389(1992). 

[ 6] Arnheiter H., Meier E. New Biol. 2:851-857(1990). 

[ 7] Staeheli P., Pitossi F., Pavlovic J. Trends Cell Biol. 3:268-272(1993). 

161. (dynamin_2) Dynamin central region 

This region lies between the GTPase domain (see dynamin) and the pleckstrin
homology (PH) domain.

162. E1-E2 ATPases phosphorylation site 

E1-E2 ATPases (also known as P-type) are cation transport ATPases which form an aspartyl
phosphate intermediate in the course of ATP hydrolysis. ATPases which belong to this family
are listed below [1,2,3].

- Fungal and plant plasma membrane (H+) ATPases [reviewed in 4].
- Vertebrate (Na+, K+) ATPases (sodium pump) [reviewed in 5,6].
- Gastric (K+, H+) ATPases (proton pump).
- Calcium (Ca++) ATPases (calcium pump) from the sarcoplasmic reticulum (SR), the
  endoplasmic reticulum (ER) and the plasma membrane.
- Copper (Cu++) ATPases (copper pump) which are involved in two human genetic disorders:
  Menkes syndrome and Wilson disease [7].
- Bacterial potassium (K+) ATPases.
- Bacterial cadmium efflux (Cd++) ATPases [reviewed in 8].
- Bacterial magnesium (Mg++) ATPases.
- A probable cation ATPase from Leishmania.
- fixI, a probable cation ATPase from Rhizobium meliloti, involved in nitrogen fixation.

The region around the phosphorylated aspartate residue is perfectly conserved in all these
ATPases and can be used as a signature pattern.

Consensus pattern: D-K-T-G-T-[LI]-[TI] [D is phosphorylated] 
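
Because the first residue of this pattern is the aspartate that becomes phosphorylated, a match directly gives the position of the candidate phosphorylation site. The following is a minimal sketch under that reading; the function name and the test fragment are hypothetical and do not come from any database entry.

```python
import re

# Regular-expression form of D-K-T-G-T-[LI]-[TI].
P_TYPE_SITE = re.compile(r"DKTGT[LI][TI]")

def phosphorylation_sites(sequence):
    """Return the 1-based position of the Asp that opens each pattern match."""
    return [m.start() + 1 for m in P_TYPE_SITE.finditer(sequence)]

# Hypothetical fragment used only to exercise the function.
print(phosphorylation_sites("MGAICSDKTGTLTQNKMV"))
```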

[ 1] Green N.M., MacLennan D.H. Biochem. Soc. Trans. 17:819-822(1989).

[ 2] Green N.M. Biochem. Soc. Trans. 17:970-972(1989).

[ 3] Fagan M.J., Saier M.H. Jr. J. Mol. Evol. 38:57-99(1994).

[ 4] Serrano R. Biochim. Biophys. Acta 947:1-28(1988). 

[ 5] Fambrough D.M. Trends Neurosci. 11:325-328(1988). 

[ 6] Sweadner K.J. Biochim. Biophys. Acta 988:185-220(1989). 

[ 7] Bull P.C., Cox D.W. Trends Genet. 10:246-251(1994). 

[ 8] Silver S., Nucifora G., Chu L., Misra T.K. Trends Biochem. Sci. 14:76-80(1989). 

163. E1_N 

E1 Protein, N terminal domain
Number of members: 90

164. (E1_dehydrog) Dehydrogenase E1 component

This family uses thiamine pyrophosphate as a cofactor. This family includes pyruvate 
dehydrogenase, 2-oxoglutarate dehydrogenase and 2-oxoisovalerate dehydrogenase. 

165. (ECH) Enoyl-CoA hydratase/isomerase signature 

Enoyl-CoA hydratase (EC 4.2.1.17) (ECH) [1] and 3,2-trans-enoyl-CoA isomerase (EC
5.3.3.8) (ECI) [2] are two enzymes involved in fatty acid metabolism. ECH catalyzes the
hydration of 2-trans-enoyl-CoA into 3-hydroxyacyl-CoA and ECI shifts the 3- double bond
of the intermediates of unsaturated fatty acid oxidation to the 2-trans position. Most
eukaryotic cells have two fatty-acid beta-oxidation systems, one located in mitochondria and
the other in peroxisomes. In mitochondria, ECH and ECI are separate yet structurally related
monofunctional enzymes. Peroxisomes contain a trifunctional enzyme [3] consisting of an
N-terminal domain that bears both ECH and ECI activity, and a C-terminal domain responsible
for 3-hydroxyacyl-CoA dehydrogenase (HCDH) activity. In Escherichia coli (gene fadB) and
Pseudomonas fragi (gene faoA), ECH and ECI are also part of a multifunctional enzyme
which contains both an HCDH and a 3-hydroxybutyryl-CoA epimerase domain [4]. A number
of other proteins have been found to be evolutionarily related to the ECH/ECI enzymes or
domains:

- 3-hydroxybutyryl-CoA dehydratase (EC 4.2.1.55) (crotonase), a bacterial enzyme
  involved in the butyrate/butanol-producing pathway.
- Naphthoate synthase (EC 4.1.3.36) (DHNA synthetase) (gene menB) [5], a bacterial enzyme
  involved in the biosynthesis of menaquinone (vitamin K2). DHNA synthetase converts
  O-succinyl-benzoyl-CoA (OSB-CoA) to 1,4-dihydroxy-2-naphthoic acid (DHNA).
- 4-chlorobenzoate dehalogenase (EC 3.8.1.6) [6], a Pseudomonas enzyme which catalyzes the
  conversion of 4-chlorobenzoate-CoA to 4-hydroxybenzoate-CoA.
- A Rhodobacter capsulatus protein of unknown function (ORF257) [7].
- Bacillus subtilis putative polyketide biosynthesis proteins pksH and pksI.
- Escherichia coli carnitine racemase (gene caiD) [8].
- Escherichia coli hypothetical protein ygfG.
- Yeast hypothetical protein YDR036c.

As a signature pattern for these enzymes, a conserved region rich in glycine and
hydrophobic residues was selected.

Consensus pattern: [LIVM]-[STA]-x-[LIVM]-[DENQRHSTA]-G-x(3)-[AG](3)-x(4)-
[LIVMST]-x-[CSTA]-[DQHP]-[LIVMFY]

[ 1] Minami-Ishii N., Taketani S., Osumi T., Hashimoto T. Eur. J. Biochem. 185:73- 
78(1989). 

[ 2] Mueller-Newen G., Stoffel W. Biol. Chem. Hoppe-Seyler 372:613-624(1991). 
[ 3] Palosaari P.M., Hiltunen J.K. J. Biol. Chem. 265:2446-2449(1990). 
[ 4] Nakahigashi K., Inokuchi H. Nucleic Acids Res. 18:4937-4937(1990). 
[ 5] Driscoll J.R., Taber H.W. J. Bacteriol. 174:5063-5071(1992). 

[ 6] Babbitt P.C., Kenyon G.L., Matin B.M., Charest H., Sylvestre M., Scholten J.D., Chang 
K.-H., Liang P.-H., Dunaway-Mariano D. Biochemistry 31:5594-5604(1992). 
[ 7] Beckman D.L., Kranz R.G. Gene 107:171-172(1991). 

[ 8] Eichler K., Bourgis F., Buchet A., Kleber H.-P., Mandrand-Berthelot M.-A. Mol. 
Microbiol. 13:775-786(1994). 

166. (EF1BD) Elongation factor 1 beta/beta'/delta chain signatures

Eukaryotic elongation factor 1 (EF-1) is responsible for the GTP-dependent binding of 
aminoacyl-tRNAs to the ribosomes [1]. EF-1 is composed of four subunits: the alpha chain 
which binds GTP and aminoacyl-tRNAs, the gamma chain that probably plays a role in 
anchoring the complex to other cellular components and the beta and delta (or beta') chains.
The beta and delta chains are highly similar proteins that both stimulate the exchange of GDP 
bound to the alpha chain for GTP [2]. The beta and delta chains are hydrophilic proteins of 
around 23 to 31 Kd. Their C-terminal part seems important for the nucleotide exchange 
activity, while the N-terminal section is probably involved in the interaction with the gamma 
chain. Two signature patterns for this family of proteins were developed. The first 
corresponds to an acidic region in the central section; the second, to the C-terminal extremity 
of these proteins.

Consensus pattern: [DE]-[DEG]-[DE](2)-[LIVMF]-D-L-F-G
Consensus pattern: [IV]-Q-S-x-D-[LIVM]-x-A-[FWM]-[NQ]-K-[LIVM]

[ 1] Riis B., Rattan I.S., Clark B.F.C., Merrick W.C. Trends Biochem. Sci. 15:420-424(1990). 
[ 2] van Damme H.T.F., Amons R., Karssies R., Timmers C.J., Janssen G.M.C., Moeller W. 
Biochim. Biophys. Acta 1050:241-247(1990). 

167. (EF1G_domain) Elongation factor 1 gamma, conserved domain

168. (EFG_C) Elongation factor G C-terminus 

This family is always found associated with GTP_EFTU. This family includes the
carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline 
resistance proteins. 

169. (EFP) Elongation factor P signature 

Elongation factor P (EF-P) [1] is a prokaryotic protein translation factor required for efficient
peptide bond synthesis on 70S ribosomes from fMet-tRNAfMet. EF-P is a protein of 21 Kd.
It is evolutionarily related to yeiP, a hypothetical protein from Escherichia coli. As a
signature pattern, a conserved region located in the C-terminal part of these proteins was
selected.

Consensus pattern: K-x-[AV]-x(4)-G-x(2)-[LIV]-x-V-P-x(2)-[LIV]-x(2)-G

[ 1] Aoki H., Adams S.-L., Turner M.A., Ganoza M.C. Biochimie 79:7-11(1997).

170. (EF_TS) Elongation factor Ts signatures

In prokaryotes elongation factor Ts (EF-Ts) is a component of the elongation cycle of protein
biosynthesis. It associates with the EF-Tu.GDP complex and induces the exchange of GDP to
GTP; it remains bound to the aminoacyl-tRNA.EF-Tu.GTP complex up to the GTP
hydrolysis stage on the ribosome [1]. EF-Ts is also a component of the chloroplast protein
biosynthetic machinery and is encoded in the genome of some algal chloroplasts [2]. It is also
present in mitochondria [3]. As signature patterns for EF-Ts, two conserved regions located
in the N-terminal part of the protein have been selected.

Consensus pattern: L-R-x(2)-T-[GSDNQ]-x-[GS]-[LIVMF]-x(0,1)-[DENKAC]-x-K-
[KRNEQS]-A-L
Consensus pattern: E-[LIVM]-[NV]-[SCV]-[QE]-T-D-F-V-[SA]-[KRN]

[ 1] Bubunenko M.G., Kireeva M.L., Gudkov A.T. Biochimie 74:419-425(1992). 
[ 2] Kostrzewa M., Zetsche K. Plant Mol. Biol. 23:67-76(1993).

[ 3] Xin H., Woriax V.L., Burkhart W.A., Spremulli L.L. J. Biol. Chem. 270:17243- 
17249(1995). 

171. (EMP24_GP25L) emp24/gp25L/p24 family

Members of this family are implicated in bringing cargo forward from the ER and 
binding to coat proteins by their cytoplasmic domains. Number of members: 30 

Paccaud JP, Thomas DY, Bergeron JJ, Nilsson T, J Cell Biol 1998;140:751-765. 


172. ENV_polyprotein 

ENV polyprotein (coat polyprotein) 

Number of members: 224 

173. (ERG4_ERG24) Ergosterol biosynthesis ERG4/ERG24 family signatures

Two fungal enzymes involved in ergosterol biosynthesis, which act by reducing double
bonds in precursors of ergosterol, have been shown to be evolutionarily related [1]. These are
C-14 sterol reductase (gene ERG24 in budding yeast and erg3 in Neurospora crassa) and
C-24(28) sterol reductase (gene ERG4 in budding yeast and sts1 in fission yeast). Their
sequences are also highly related to that of chicken lamin B receptor, which is thought to
anchor the lamina to the inner nuclear membrane. These proteins are highly hydrophobic and
seem to contain seven or eight transmembrane regions. As signature patterns, two conserved
regions were selected. The first one is apparently located in a loop between the fourth and
fifth transmembrane regions and the second is in the C-terminal section.

Consensus pattern: G-x(2)-[LIVM]-[YH]-D-x-[FYW]-x-G-x(2)-L-N-P-R
Consensus pattern: [LIVM](2)-H-R-x(2)-R-D-x(3)-C-x(2)-K-Y-G

[ 1] Lai M.H., Bard M., Pierson C.A., Alexander J.F., Goebl M., Carter G.T., Kirsch D.R. 
Gene 140:41-49(1994). 

174. (ERM) Ezrin/radixin/moesin family 

The proteins of this family contain a band 4.1 domain (Band_41) at their amino terminus.
This family represents the rest of these proteins.

[1] Yonemura S, Hirao M, Doi Y, Takahashi N, Kondo T, Tsukita S, J Cell Biol 
1998;140:885-895. 

175. ER lumen protein retaining receptor signatures 

Proteins that reside in the lumen of the endoplasmic reticulum (ER) contain a C-terminal
tetrapeptide (generally K-D-E-L or H-D-E-L) that serves as a signal for their retrieval
(retrograde transport) from subsequent compartments of the secretory pathway. The signal is
recognized by a receptor molecule that is believed to cycle between the cis side of the Golgi
apparatus and the ER [1]. This protein is known as the ER lumen protein retaining receptor or
also as the 'KDEL receptor'. It has been characterized in a variety of species, including fungi
(gene ERD2), plants, Plasmodium, Drosophila and mammals. In mammals two highly related 
forms of the receptor are known. Structurally, the receptor is a protein of about 220 residues 
that seems to contain seven transmembrane regions [2]. The N-terminal part (3 residues) is 
oriented toward the lumen while the C-terminal tail (about 12 residues) is cytoplasmic. There 
are three lumenal and three cytoplasmic loops. Two signature patterns for these receptors 
were developed. The first pattern corresponds to the C-terminal half of the first cytoplasmic 
loop as well as most of the second transmembrane domain. The second pattern is a perfectly 
conserved decapeptide that corresponds to the central part of the fifth transmembrane 
domain. 

Consensus pattern: G-I-S-x-[KR]-x-Q-x-L-[FY]-x-[LIV](2)-F-x(2)-R-Y
Consensus pattern: L-E-[SA]-V-A-I-[LM]-P-Q-L
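
The retrieval signal recognized by this receptor is the C-terminal tetrapeptide of the cargo protein (generally K-D-E-L or H-D-E-L), so a candidate ER-resident protein can be screened with a simple suffix test. The sketch below assumes only those two variants are of interest; the function name and the test peptides are hypothetical.

```python
def has_er_retrieval_signal(sequence, signals=("KDEL", "HDEL")):
    """Return True if the sequence ends with one of the given C-terminal tetrapeptides."""
    return sequence.upper().endswith(tuple(signals))

# Hypothetical peptides: only the first carries the tetrapeptide at its C-terminus.
print(has_er_retrieval_signal("MKTAYIAKQRKDEL"))
print(has_er_retrieval_signal("MKDELTAYIAKQR"))
```

An internal K-D-E-L does not count, because the signal is only described at the C-terminal extremity.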

[ 1] Pelham H.R.B. Curr. Opin. Cell Biol. 3:585-591(1991). 

[ 2] Townsley P.M., Wilson D.W., Pelham H.R.B. EMBO J. 12:2821-2829(1993). 

176. (ETF_beta) Electron transfer flavoprotein beta-subunit signature 
The electron transfer flavoprotein (ETF) [1,2] serves as a specific electron acceptor for 
various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory chain 
via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha and a
beta subunit and binds one molecule of FAD per dimer. A similar system also exists in
some bacteria. The beta subunit of ETF is a protein of about 28 Kd which is structurally
related to the bacterial nitrogen fixation protein fixA, which could play a role in a redox
process and feed electrons to ferredoxin. Other related proteins are: - Escherichia coli
hypothetical protein ydiQ. - Escherichia coli hypothetical protein ygcR. As a signature pattern
for these proteins, a conserved region which is located in the central section was selected.

Consensus pattern: [IVA]-x-[KR]-x(2)-[DE]-[GD]-[GDE]-x(1,2)-[EQ]-x-[LIV]-x(4)-P-x-
[LIVM](2)-[TAC]

[ 1] Finocchiaro G., Ikeda Y., Ito M., Tanaka K. Prog. Clin. Biol. Res. 321:637-652(1990). 
[ 2] Tsai M.H., Saier M.H. Jr. Res. Microbiol. 146:397-404(1995).

177. Endonuclease III signatures

Escherichia coli endonuclease III (EC 4.2.99.18) (gene nth) [1] is a DNA repair enzyme that
acts both as a DNA N-glycosylase, removing oxidized pyrimidines from DNA, and as an
apurinic/apyrimidinic (AP) endonuclease, introducing a single-strand nick at the site from
which the damaged base was removed. Endonuclease III is an iron-sulfur protein that binds a
single 4Fe-4S cluster. The 4Fe-4S cluster does not seem to be important for catalytic activity,
but is probably involved in the proper positioning of the enzyme along the DNA strand
[2]. Endonuclease III is evolutionarily related to the following proteins:

- Fission yeast endonuclease III homolog (gene nth1) [3].
- DNA repair protein mutY from Escherichia coli and related species, which is an adenine
  glycosylase. MutY is a larger protein (350 amino acids) than endonuclease III (211 amino
  acids).
- Micrococcus luteus ultraviolet N-glycosylase/AP lyase, which initiates repair at cis-syn
  pyrimidine dimers.
- ORF10 in plasmid pFV1 of the thermophilic archaebacterium Methanobacterium
  thermoformicicum [4]. Restriction methylase M.MthTI, which is encoded by this plasmid,
  generates 5-methylcytosine which is subject to deamination resulting in G-T mismatches.
  This protein could correct these mismatches.
- Yeast hypothetical protein YAL015c.
- Fission yeast hypothetical protein SpAC26A3.02.
- Caenorhabditis elegans hypothetical protein R10E4.5.
- Methanococcus jannaschii hypothetical protein MJ0613.

The 4Fe-4S cluster is bound by four cysteines which are all located in a 17 amino acid
region at the C-terminal end of endonuclease III. A similar region is also present in the
central section of mutY and in the C-terminus of ORF10 and of the Micrococcus UV
endonuclease. The 4Fe-4S cluster region does not exist in YAL015c. Two signature patterns
for these proteins were developed: the first corresponds to the core of the iron-sulfur
binding domain, the second corresponds to the best conserved region in the catalytic core of
these enzymes.

Consensus pattern: C-x(3)-[KRS]-P-[KRAGL]-C-x(2)-C-x(5)-C [The four Cs are 4Fe-4S
ligands]

Consensus pattern: [GST]-x-[LIVMF]-P-x(5)-[LIVMW]-x(2,3)-[LI]-[PAS]-G-V-[GA]-x(3)-
[GAC]-x(3)-[LIVM]-x(2)-[SALV]-[LIVMFYW]-[GANK]
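
The first pattern spans exactly the 17-residue region that carries the four cysteine ligands (1 + 3 + 1 + 1 + 1 + 1 + 2 + 1 + 5 + 1 = 17). As a minimal sketch, a regular expression with one capture group per cysteine can report where the candidate 4Fe-4S ligands lie in a sequence. The function name and the toy sequence below are hypothetical; the toy sequence was constructed only to satisfy the pattern.

```python
import re

# Regular-expression form of C-x(3)-[KRS]-P-[KRAGL]-C-x(2)-C-x(5)-C,
# with one capture group around each cysteine.
FES_SITE = re.compile(r"(C).{3}[KRS]P[KRAGL](C).{2}(C).{5}(C)")

def cysteine_ligands(sequence):
    """Return ([1-based cysteine positions], match length) for the first match, or None."""
    m = FES_SITE.search(sequence)
    if m is None:
        return None
    positions = [m.start(i) + 1 for i in range(1, 5)]
    return positions, m.end() - m.start()

# Toy sequence built to satisfy the pattern; not a real protein.
print(cysteine_ligands("AACAAAKPKCAACAAAAACAA"))
```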

[ 1] Kuo C.-F., McRee D., Fisher C.L., O'Handley S.F., Cunningham R.P., Tainer J.A. Science
258:434-440(1992).

[ 2] Thomson A.J. Curr. Biol. 3:173-174(1993).

[ 3] Roldan-Arjona T., Anselmino C., Lindahl T. Nucleic Acids Res. 3307-3312(1996).
[ 4] Noelling J., van Eeden F.J.M., Eggen R.I.L., de Vos W.M. Nucleic Acids Res. 20:6501- 
6507(1992). 

178. (Epimerase) NAD dependent epimerase/dehydratase family 

The proteins of this family utilize NAD as a cofactor. The proteins in this family use
nucleotide-sugar substrates for a variety of chemical reactions. 

[1] Thoden JB, Hegeman AD, Wesenberg G, Chapeau MC, Frey PA, Holden HM, 
Biochemistry 1997;36:6294-6304. 

179. Exonuclease 

This family includes a variety of exonuclease proteins, such as ribonuclease T and the 
epsilon subunit of DNA polymerase III. 

[1] Koonin EV, Deutscher MP, Nucleic Acids Res 1993;21:2521-2522. 

180. ENTH 
ENTH domain 

[1] Kay BK, Yamabhai M, Wendland B, Emr SD; Medline: 99156083, Identification of a 
novel domain shared by putative components of the endocytic and cytoskeletal machinery. 
Protein Sci 1999;8:435-438. 

The ENTH (Epsin N-terminal homology) domain is found in proteins involved in endocytosis 
and cytoskeletal machinery. The function of the ENTH domain is unknown. 



Number of members: 29 

181. (eIF-1A) Eukaryotic initiation factor 1A signature

Eukaryotic translation initiation factor 1A (eIF-1A) [1] (formerly known as eIF-4C) is a
protein that seems to be required for maximal rate of protein biosynthesis. It enhances
ribosome dissociation into subunits and stabilizes the binding of the initiator Met-tRNA to
40S ribosomal subunits. eIF-1A is a hydrophilic protein of about 15 to 17 Kd. Archaebacteria
also seem to possess an eIF-1A homolog. As a signature pattern, a conserved region in the
central section of these proteins was selected.

Consensus pattern: [IM]-x-G-x-[GS]-[KRH]-x(4)-[CL]-x-D-G-x(2)-R-x(2)-[RH]-I-x-G

[ 1] Wei C.-L., Kainuma M., Hershey J.W.B. J. Biol. Chem. 270:22788-22794(1995).

182. (eIF-5A) Eukaryotic initiation factor 5A hypusine signature

Eukaryotic initiation factor 5A (eIF-5A) (formerly known as eIF-4D) [1,2] is a small protein
whose precise role in the initiation of protein synthesis is not known. It appears to promote
the formation of the first peptide bond. eIF-5A seems to be the only eukaryotic protein to
contain a hypusine residue. Hypusine is derived from lysine by the post-translational
addition of a butylamino group (from spermidine) to the epsilon-amino group of lysine. The
hypusine group is essential to the function of eIF-5A. A hypusine-containing protein has been
found in archaebacteria such as Sulfolobus acidocaldarius or Methanococcus jannaschii; this
protein is highly similar to eIF-5A and could play a similar role in protein biosynthesis. The
signature developed for eIF-5A is centered around the hypusine residue.

Consensus pattern: [PT]-G-K-H-G-x-A-K [The first K is modified to hypusine] 

[ 1] Park M.H., Wolff E.G., Folk J.E. Biofactors 4:95-104(1993). 

[ 2] Schnier J., Schwelberger H.G., Smit-McBride Z., Kang H.A., Hershey J.W.B. Mol. Cell. 
Biol. 11:3105-3114(1991). 

183. (efhand) S-100/ICaBP type calcium binding protein signature

S-100 are small dimeric acidic calcium and zinc-binding proteins [1] abundant in the brain.
They have two different types of calcium-binding sites: a low affinity one with a special
structure and a 'normal' EF-hand type high affinity site. The vitamin-D dependent intestinal
calcium-binding protein (ICaBP or calbindin 9 Kd) also belongs to this family of proteins,
but it does not form dimers. In the past years the sequences of many new members of this
family have been determined (for reviews see [2,3,4]); in most cases the function of these
proteins is not yet known, although it is becoming clear that they are involved in cell growth
and differentiation, cell cycle regulation and metabolic control. These proteins are:

- Calcyclin (Prolactin receptor associated protein (PRA); clatropin; 2a9; 5B10; S100A6).
- Calpactin I light chain (p10; p11; 42c; S100A10).
- Calgranulin A (cystic fibrosis antigen (CFAg); MIF related protein 8 (MRP-8); p8; S100A8).
- Calgranulin B (MIF related protein 14 (MRP-14); p14; S100A9).
- Calgranulin C.
- Calgizzarin (S100C).
- Placental calcium-binding protein (CAPL) (18a2; pEL98; 42a; p9K; MTS1; metastatin;
  S100A4).
- Protein S-100D (S100A5).
- Protein S-100E (S100A3).
- Protein S-100L (CAN19; S100A2).
- Placental protein S-100P (S100E).
- Psoriasin (S100A7).
- Chemotactic cytokine CP-10 [5].
- Protein MRP-126 [6].
- Trichohyalin [7]. This is a large intermediate filament-associated protein that associates
  with keratin intermediate filaments (KIF); it contains an S-100 type domain in its
  N-terminal extremity.

A number of these proteins are known to bind calcium while others are not (p10 for
example). The EF-hand detecting pattern will fail to pick up those proteins which have lost
their calcium-binding properties. A pattern was therefore developed which unambiguously
picks up proteins belonging to this family. This pattern spans the region of the EF-hand high
affinity site but makes no assumptions on the calcium-binding properties of this site.

Consensus pattern: [LIVMFYW](2)-x(2)-[LK]-D-x(3)-[DN]-x(3)-[DNSG]-[FY]-x-[ES]-
[FYVC]-x(2)-[LIVMFS]-[LIVMF]

[ 1] Baudier J. (In) Calcium and Calcium Binding proteins, Gerday C., Bolis L., Gilles R.,
Eds., pp102-113, Springer Verlag, Berlin, (1988).
[ 2] Moncrief N.D., Kretsinger R.H., Goodman M. J. Mol. Evol. 30:522-562(1990).
[ 3] Kligman D., Hilt D.C. Trends Biochem. Sci. 13:437-443(1988).
[ 4] Schaefer B.W., Wicki R., Engelkamp D., Mattei M.-G., Heizmann C.W. Genomics
25:638-643(1995).
[ 5] Lackmann M., Cornish C.J., Simpson R.J., Moritz R.L., Geczy C.L. J. Biol. Chem.
267:7499-7504(1992).
[ 6] Nakano T., Graf T. Oncogene 7:527-534(1992).
[ 7] Lee S.-C., Kim I.-G., Marekov L.N., O'Keefe E.J., Parry D.A.D., Steinert P.M. J. Biol.
Chem. 268:12164-12176(1993).

EF-hand calcium-binding domain 

Many calcium-binding proteins belong to the same evolutionary family and share
a type of calcium-binding domain known as the EF-hand [1 to 5]. This type of
domain consists of a twelve residue loop flanked on both sides by a twelve
residue alpha-helical domain. In an EF-hand loop the calcium ion is
coordinated in a pentagonal bipyramidal configuration. The six residues
involved in the binding are in positions 1, 3, 5, 7, 9 and 12; these residues
are denoted by X, Y, Z, -Y, -X and -Z. The invariant Glu or Asp at position 12
provides two oxygens for liganding Ca (bidentate ligand).
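
The correspondence between loop positions and ligand labels can be written down directly. The sketch below labels the six coordinating residues of a twelve-residue loop; the function name and the example loop are hypothetical, and the loop was chosen only because it has Asp at position 1 and Glu at position 12.

```python
# Loop positions (1-based) of the six calcium ligands and their conventional labels.
LIGAND_LABELS = {1: "X", 3: "Y", 5: "Z", 7: "-Y", 9: "-X", 12: "-Z"}

def label_loop(loop):
    """Return (position, label, residue) for the six liganding positions of a loop."""
    if len(loop) != 12:
        raise ValueError("an EF-hand loop is expected to have twelve residues")
    return [(pos, LIGAND_LABELS[pos], loop[pos - 1]) for pos in sorted(LIGAND_LABELS)]

# Illustrative twelve-residue loop (hypothetical example).
for position, label, residue in label_loop("DKDGDGTITTKE"):
    print(position, label, residue)
```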

Listed below are the proteins which are known to contain EF-hand regions. For
each type of protein the total number of EF-hand regions known or supposed to exist
is indicated in parentheses. This number does not include
regions which clearly have lost their calcium-binding properties, or the
atypical low-affinity site (which spans thirteen residues) found in the S-100/
ICaBP family of proteins [6].

- Aequorin and Renilla luciferin binding protein (LBP) (Ca=3).
- Alpha actinin (Ca=2).
- Calbindin (Ca=4).
- Calcineurin B subunit (protein phosphatase 2B regulatory subunit) (Ca=4).
- Calcium-binding protein from Streptomyces erythraeus (Ca=3?).
- Calcium-binding protein from Schistosoma mansoni (Ca=2?).
- Calcium-binding proteins TCBP-23 and TCBP-25 from Tetrahymena thermophila (Ca=4?).
- Calcium-dependent protein kinases (CDPK) from plants (Ca=4).
- Calcium vector protein from amphioxus (Ca=2).
- Calcyphosin (thyroid protein p24) (Ca=4?).
- Calmodulin (Ca=4, except in yeast where Ca=3).
- Calpain small and large chains (Ca=2).
- Calretinin (Ca=6).
- Calcyclin (prolactin receptor associated protein) (Ca=2).




- Caltractin (centrin) (Ca=2 or 4).
- Cell Division Control protein 31 (gene CDC31) from yeast (Ca=2?).
- Diacylglycerol kinase (EC 2.7.1.107) (DGK) (Ca=2).
- FAD-dependent glycerol-3-phosphate dehydrogenase (EC 1.1.99.5) from mammals (Ca=1).
- Fimbrin (plastin) (Ca=2).
- Flagellar calcium-binding protein (1f8) from Trypanosoma cruzi (Ca=1 or 2).
- Guanylate cyclase activating protein (GCAP) (Ca=3).
- Inositol phospholipid-specific phospholipase C isozymes gamma-1 and delta-1 (Ca=2) [10].
- Intestinal calcium-binding protein (ICaBPs) (Ca=2).
- MIF related proteins 8 (MRP-8 or CFAG) and 14 (MRP-14) (Ca=2).
- Myosin regulatory light chains (Ca=1).
- Oncomodulin (Ca=2).
- Osteonectin (basement membrane protein BM-40) (SPARC) and proteins that
  contain an 'osteonectin' domain (QR1, matrix glycoprotein SC1) (see the
  entry <PDOC00535>) (Ca=1).
- Parvalbumins alpha and beta (Ca=2).
- Placental calcium-binding protein (18a2) (nerve growth factor induced
  protein 42a) (p9k) (Ca=2).
- Recoverins (visinin, hippocalcin, neurocalcin, S-modulin) (Ca=2 to 3).
- Reticulocalbin (Ca=4).
- S-100 protein, alpha and beta chains (Ca=2).
- Sarcoplasmic calcium-binding protein (SCPs) (Ca=2 to 3).
- Sea urchin proteins Spec 1 (Ca=4), Spec 2 (Ca=4?), Lps-1 (Ca=8).
- Serine/threonine protein phosphatase rdgC (EC 3.1.3.16) from Drosophila (Ca=2).
- Sorcin V19 from hamster (Ca=2).
- Spectrin alpha chain (Ca=2).
- Squidulin (optic lobe calcium-binding protein) from squid (Ca=4).
- Troponins C; from skeletal muscle (Ca=4), from cardiac muscle (Ca=3), from
  arthropods and molluscs (Ca=2).

There have been a number of attempts [7,8] to develop patterns that pick up EF-hand
regions, but these studies were made a few years ago when not so many
different families of calcium-binding proteins were known. Therefore
a new pattern was developed which takes into account all published sequences. This
pattern includes the complete EF-hand loop as well as the first residue which
follows the loop and which seems always to be hydrophobic.

Consensus pattern: D-x-[DNS]-{ILVFYW}-[DENSTG]-[DNQGHRK]-{GP}-[LIVMC]-
[DENQSTAGC]-x(2)-[DE]-[LIVMFYW]

-Note: positions 1 (X), 3 (Y) and 12 (-Z) are the most conserved.
-Note: the 6th residue in an EF-hand loop is, in most cases, a Gly, but the number of
exceptions to this 'rule' has gradually increased and therefore the pattern should include all
the different residues which have been shown to exist in this position in functional Ca-
binding sites.
-Note: the pattern will, in some cases, miss one of the EF-hand regions in some proteins
with multiple EF-hand domains.
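
This pattern covers thirteen residues: the twelve-residue loop plus the hydrophobic residue that follows it. It also uses excluded-residue sets, written in curly braces, which correspond to negated character classes in a regular expression. The following is a minimal sketch of a scan that returns the loop portion of each match; the function name and both test peptides are hypothetical, and, as noted above, such a scan can miss genuine EF-hand regions.

```python
import re

# Regular-expression form of the EF-hand pattern; {ILVFYW} and {GP} become [^...] classes.
EF_HAND = re.compile(
    r"D.[DNS][^ILVFYW][DENSTG][DNQGHRK][^GP][LIVMC][DENQSTAGC].{2}[DE][LIVMFYW]"
)

def ef_hand_loops(sequence):
    """Return (1-based start, twelve-residue loop) for each non-overlapping match."""
    return [(m.start() + 1, m.group()[:12]) for m in EF_HAND.finditer(sequence)]

# Hypothetical peptides: the second has Leu at loop position 4, which the pattern excludes.
print(ef_hand_loops("AFSLFDKDGDGTITTKELGTV"))
print(ef_hand_loops("AFSLFDKDLDGTITTKELGTV"))
```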


[ 1] Kawasaki H., Kretsinger R.H. Protein Prof. 2:305-490(1995).
[ 2] Kretsinger R.H. Cold Spring Harbor Symp. Quant. Biol. 52:499-510(1987).
[ 3] Moncrief N.D., Kretsinger R.H., Goodman M. J. Mol. Evol. 30:522-562(1990).
[ 4] Nakayama S., Moncrief N.D., Kretsinger R.H. J. Mol. Evol. 34:416-448(1992).
[ 5] Heizmann C.W., Hunziker W. Trends Biochem. Sci. 16:98-103(1991).
[ 6] Kligman D., Hilt D.C. Trends Biochem. Sci. 13:437-443(1988).
[ 7] Strynadka N.C.J., James M.N.G. Annu. Rev. Biochem. 58:951-98(1989).
[ 8] Haiech J., Sallantin J. Biochimie 67:555-560(1985).
[ 9] Chauvaux S., Beguin P., Aubert J.-P., Bhat K.M., Gow L.A., Wood T.M., Bairoch A.
Biochem. J. 265:261-265(1990).
[10] Bairoch A., Cox J.A. FEBS Lett. 269:454-456(1990).

184. Enolase signature

Enolase (EC 4.2.1.11) is a glycolytic enzyme that catalyzes the dehydration of 2-phospho-D-
glycerate to phosphoenolpyruvate [1]. It is a dimeric enzyme that requires magnesium both
for catalysis and for stabilizing the dimer. Enolase is probably found in all organisms that
metabolize sugars. In vertebrates, there are three different tissue-specific isozymes: alpha
present in most tissues, beta in muscles and gamma found only in nervous tissues.
Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been
shown [2] to be evolutionarily related to enolase. As a signature pattern for enolase, the best
conserved region was selected; it is located in the C-terminal third of the sequence.

Consensus pattern: [LIV](3)-K-x-N-Q-I-G-[ST]-[LIV]-[ST]-[DE]-[STA]

[ 1] Lebioda L., Stec B., Brewer J.M. J. Biol. Chem. 264:3685-3693(1989).
[ 2] Wistow G., Piatigorsky J. Science 236:1554-1556(1987).

185. (F-actin_cap_A) F-actin capping protein alpha subunit signatures

The F-actin capping protein binds in a calcium-independent manner to the fast growing ends
of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends.
Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping
protein is a heterodimer composed of two unrelated subunits: alpha and beta. The alpha
subunit is a protein of about 268 to 286 amino acid residues whose sequence is well
conserved in eukaryotic species [1]. As signature patterns two highly conserved regions in the
C-terminal section of the alpha subunit were selected.

Consensus pattern: V-H-[FY](2)-E-D-G-N-V
Consensus pattern: F-K-[AE]-L-R-R-x-L-P

[ 1] Cooper J.A., Caldwell J.E., Gattermeir D.J., Torres M.A., Amatruda J.F., Casella J.F.
Cell Motil. Cytoskeleton 18:204-214(1991).



186. F-box domain 

[1] Bai C, Sen P, Hofmann K, Ma L, Goebl M, Harper JW, Elledge SJ, Cell
1996;86:263-274.
[2] Skowyra D, Craig KL, Tyers M, Elledge SJ, Harper JW, Cell
1997;91:209-219.

187. F-protein 

Negative factor (F-protein), or Nef.

[1] Arold S, Franken P, Strub M-P, Hoh F, Benichou S, Benarous R, Dumas C; Medline:
98035457, The crystal structure of HIV-1 Nef protein bound to the Fyn kinase SH3 domain
suggests a role for this complex in altered T cell receptor signalling. Structure 1997;5:1361-
1372.

Nef protein accelerates virulent progression of AIDS by its interaction with cellular proteins
involved in signal transduction and host cell activation. Nef has been shown to bind
specifically to a subset of the Src kinase family.

Number of members: 1013

188. (FAD_binding_2) 

Fumarate reductase / succinate dehydrogenase FAD-binding site 

In bacteria two distinct, membrane-bound, enzyme complexes are responsible for the
interconversion of fumarate and succinate (EC 1.3.99.1): fumarate reductase (Frd) is used in
anaerobic growth, and succinate dehydrogenase (Sdh) is used in aerobic growth. Both
complexes consist of two main components: a membrane-extrinsic component composed of a
FAD-binding flavoprotein and an iron-sulfur protein; and a hydrophobic component
composed of a membrane anchor protein and/or a cytochrome B.

In eukaryotes mitochondrial succinate dehydrogenase (ubiquinone) (EC 1.3.5.1) is an enzyme
composed of two subunits: a FAD flavoprotein and an iron-sulfur protein.

The flavoprotein subunit is a protein of about 60 to 70 Kd to which FAD is covalently bound
at a histidine residue located in the N-terminal section of the protein [1]. The
sequence around that histidine is well conserved in Frd and Sdh from various bacterial and
eukaryotic species [2] and can be used as a signature pattern.

Consensus pattern: R-[ST]-H-[ST]-x(2)-A-x-G-G [H is the FAD binding site]
Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Blaut M., Whittaker K., Valdovinos A., Ackrell B.A., Gunsalus R.P., Cecchini G. J. Biol. 
Chem. 264:13599-13604(1989). 
[ 2] Birch-Machin M.A., Farnsworth L., Ackrell B.A., Cochran B., Jackson S., Bindoff L.A.,
Aitken A., Diamond A.G., Turnbull D.M. J. Biol. Chem. 267:11553-11558(1992).

189. Fatty acid desaturases signatures (FA_desaturase)

Fatty acid desaturases (EC 1.14.99.-) are enzymes that catalyze the insertion of a double bond
at the delta position of fatty acids. There seem to be two distinct families of fatty acid
desaturases which do not seem to be evolutionarily related.

Family 1 is composed of:

- Stearoyl-CoA desaturase (SCD) (EC 1.14.99.5) [1]. SCD is a key regulatory enzyme of
  unsaturated fatty acid biosynthesis. SCD introduces a cis double bond at the delta(9)
  position of fatty acyl-CoA's such as palmitoleoyl- and oleoyl-CoA. SCD is a
  membrane-bound enzyme that is thought to function as a part of a multienzyme complex in
  the endoplasmic reticulum of vertebrates and fungi.

As a signature pattern for this family a conserved region in the C-terminal part of these
enzymes was selected; this region is rich in histidine residues and in aromatic residues.

Family 2 is composed of:

- Plant stearoyl-acyl-carrier-protein desaturase (EC 1.14.99.6) [2]. These enzymes catalyze
  the introduction of a double bond at the delta(9) position of stearoyl-ACP to produce
  oleoyl-ACP. This enzyme is responsible for the conversion of saturated fatty acids to
  unsaturated fatty acids in the synthesis of vegetable oils.
- Cyanobacteria desA [3], an enzyme that can introduce a second cis double bond at the
  delta(12) position of fatty acids bound to membrane glycerolipids. DesA is involved in
  chilling tolerance; the phase transition temperature of lipids of cellular membranes is
  dependent on the degree of unsaturation of fatty acids of the membrane lipids.

As a signature pattern for this family a conserved region in the C-terminal part of these
enzymes was selected.

Consensus pattern: G-E-x-[FY]-H-N-[FY]-H-H-x-F-P-x-D-Y

Consensus pattern: [ST]-[SA]-x(3)-[QR]-[LI]-x(5,6)-D-Y-x(2)-[LIVMFYW]-[LIVM]-[DE]

[ 1] Kaestner K.H., Ntambi J.M., Kelly T.J. Jr., Lane M.D. J. Biol. Chem. 264:14755-14761(1989).

[ 2] Shanklin J., Somerville C.R. Proc. Natl. Acad. Sci. U.S.A. 88:2510-2514(1991). 
[ 3] Wada H., Gombos Z., Murata N. Nature 347:200-203(1990).

190. Fructose-1,6-bisphosphatase active site (FBPase)

Fructose-1,6-bisphosphatase (EC 3.1.3.11) (FBPase) [1], a regulatory enzyme in
gluconeogenesis, catalyzes the hydrolysis of fructose 1,6-bisphosphate to fructose
6-phosphate. It is involved in many different metabolic pathways and found in most
organisms. Sedoheptulose-1,7-bisphosphatase (EC 3.1.3.37) (SBPase) [2] is an enzyme found
in plant chloroplasts and in photosynthetic bacteria that catalyzes the hydrolysis of
sedoheptulose 1,7-bisphosphate to sedoheptulose 7-phosphate, a step in the Calvin reductive
pentose phosphate cycle. It is functionally and structurally related to FBPase. In mammalian
FBPase, a lysine residue has been shown to be involved in the catalytic mechanism [3]. The
region around this residue is highly conserved and can be used as a signature pattern for
FBPase and SBPase. It must be noted that, in some bacterial FBPase sequences, the active site
lysine is replaced by an arginine.

Consensus pattern: [AG]-[RK]-L-x(1,2)-[LIV]-[FY]-E-x(2)-P-[LIVM]-[GSA] [K/R is the
active site residue]

[ 1] Benkovic S.J., DeMaine M.M. Adv. Enzymol. 53:45-82(1982).
[ 2] Raines C.A., Lloyd J.C., Willingham N.M., Potts S., Dyer T.A. Eur. J. Biochem.
205:1053-1059(1992).
[ 3] Ke H., Thorpe C.M., Seaton B.A., Lipscomb W.N., Marcus F. J. Mol. Biol. 212:513-539(1989).

191. FGGY family of carbohydrate kinases signatures

It has been shown [1] that four different types of carbohydrate kinases seem to be
evolutionarily related. These enzymes are:
- L-fuculokinase (EC 2.7.1.51) (gene fucK).
- Gluconokinase (EC 2.7.1.12) (gene gntK).
- Glycerokinase (EC 2.7.1.30) (gene glpK).
- Xylulokinase (EC 2.7.1.17) (gene xylB).
- L-xylulose kinase (EC 2.7.1.53) (gene lyxK).
These enzymes are proteins of from 480 to 520 amino acid residues. As consensus patterns for
this family of kinases two conserved regions were selected, one in the central section, the
other in the C-terminal section.

Consensus pattern: [MFYGS]-x-[PST]-x(2)-K-[LIVMFYW]-x-W-[LIVMF]-x-[DENQTKR]-[ENQH]

Consensus pattern: [GSA]-x-[LIVMFYW]-x-G-[LIVM]-x(7,8)-[HDENQ]-[LIVMF]-x(2)-[AS]-[STAIVM]-[LIVMFY]-[DEQ]

[ 1] Reizer A., Deutscher J., Saier M.H. Jr., Reizer J. Mol. Microbiol. 5:1081-1089(1991). 

192. FKBP-type peptidyl-prolyl cis-trans isomerase signatures/profile (FKBP)

FKBP [1,2,3] is the major high-affinity binding protein, in vertebrates, for the
immunosuppressive drug FK506. It exhibits peptidyl-prolyl cis-trans isomerase activity (EC
5.2.1.8) (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by
catalyzing the cis-trans isomerization of proline imidic peptide bonds in oligopeptides [4].

At least three different forms of FKBP are known in mammalian species:
- FKBP-12, which is cytosolic and inhibited by both FK506 and rapamycin.
- FKBP-13, which is membrane associated and inhibited by both FK506 and rapamycin.
- FKBP-25, which is preferentially inhibited by rapamycin.

These forms of FKBP are evolutionarily related and show extensive similarities [5,6,7] with
the following proteins:
- Fungal FKBP.
- Mammalian hsp binding immunophilin (HBI) (also called p59). HBI is a protein which binds
to hsp90 and contains two FKBP-like domains in its N-terminal section, the first of which
seems to be functional.
- The C-terminal part of the cell-surface protein mip from Legionella; a protein associated
with macrophage infection by an unknown mechanism.
- Escherichia coli slyD [8], a protein with an N-terminal FKBP domain followed by a
histidine-rich metal-binding domain.
- Escherichia coli fkpA.
- Escherichia coli fklB (FKBP22).
- Escherichia coli slpA.
- Bacterial trigger factor (Tig).
- Streptomyces hygroscopicus and chrysomallus FK506-binding protein.
- Chlamydia trachomatis 27 Kd membrane protein.
- Neisseria meningitidis strain C114 PPIase.
- Probable PPIases from Haemophilus influenzae (HI0754), Methanococcus
jannaschii (MJ0278 and MJ0825), Pseudomonas fluorescens and Pseudomonas aeruginosa.

Two signature patterns for these proteins were developed. One is based on a conserved region
in the N-terminus of FKBP, the other is located in the central section. The profile for FKBP
spans the complete domain.

Consensus pattern: [LIVMC]-x-[YF]-x-[GVL]-x(1,2)-[LFT]-x(2)-G-x(3)-[DE]-[STAEQK]-[STAN]

Consensus pattern: [LIVMFY]-x(2)-[GA]-x(3,4)-[LIVMF]-x(2)-[LIVMFHK]-x(2)-G-x(4)-[LIVMF]-x(3)-[PSGAQ]-x(2)-[AG]-[FY]-G

[ 1] Tropschug M., Wachter E., Mayer S., Schoenbrunner E.R., Schmid F.X. Nature 346:674-677(1990).
[ 2] Stein R.L. Curr. Biol. 1:234-236(1991).
[ 3] Siekierka J.J., Wiederrecht G., Greulich H., Boulton D., Hung S.H.Y., Cryan J., Hodges
P.J., Sigal N.H. J. Biol. Chem. 265:21011-21015(1990).
[ 4] Fischer G., Schmid F.X. Biochemistry 29:2205-2212(1990).
[ 5] Trandinh C.C., Pao G.M., Saier M.H. Jr. FASEB J. 6:3410-3420(1992).
[ 6] Galat A. Eur. J. Biochem. 216:689-707(1993).
[ 7] Hacker J., Fischer G. Mol. Microbiol. 10:445-456(1993).
[ 8] Wuelfing C., Lombardero J., Plueckthun A. J. Biol. Chem. 269:2895-2901(1994).

193. MAPEG family (aka: FLAP/GST2/LTC4S family signature) 

The following mammalian proteins are evolutionarily related [1]:
- Leukotriene C4 synthase (EC 2.5.1.37) (gene LTC4S), an enzyme that catalyzes
the production of LTC4 from LTA4.
- Microsomal glutathione S-transferase II (EC 2.5.1.18) (GST-II) (gene GST2), an
enzyme that can also produce LTC4 from LTA4.
- 5-lipoxygenase activating protein (gene FLAP), a protein that seems to be
required for the activation of 5-lipoxygenase.

These are proteins of 150 to 160 residues that contain three transmembrane segments.
As a signature pattern, a conserved region between the first and second transmembrane
domains was selected.

Consensus pattern: G-x(3)-F-E-R-V-[FY]-x-A-[NQ]-x-N-C

[1] Jakobsson P.-J., Mancini J.A., Ford-Hutchinson A.W. J. Biol. Chem. 271:22203-22210(1996).

194. FMN-dependent alpha-hydroxy acid dehydrogenases active site (FMN_dh)

A number of oxidoreductases that act on alpha-hydroxy acids and which are FMN-containing
flavoproteins have been shown [1,2,3] to be structurally related; these enzymes are:
- Lactate dehydrogenase (EC 1.1.2.3), which consists of a dehydrogenase domain and a
heme-binding domain called cytochrome b2 and which catalyzes the conversion of lactate into
pyruvate.
- Glycolate oxidase (EC 1.1.3.15) ((S)-2-hydroxy-acid oxidase), a peroxisomal enzyme that
catalyzes the conversion of glycolate and oxygen to glyoxylate and hydrogen peroxide.
- Long chain alpha-hydroxy acid oxidase from rat (EC 1.1.3.15), a peroxisomal enzyme.
- Lactate 2-monooxygenase (EC 1.13.12.4) (lactate oxidase) from Mycobacterium smegmatis,
which catalyzes the conversion of lactate and oxygen to acetate, carbon dioxide and water.
- (S)-mandelate dehydrogenase from Pseudomonas putida (gene mdlB), which catalyzes the
oxidation of (S)-mandelate to benzoylformate.

The first step in the reaction mechanism of these enzymes is the abstraction of the proton
from the alpha-carbon of the substrate, producing a carbanion which can subsequently attach
to the N5 atom of FMN. A conserved histidine has been shown [4] to be involved in the removal
of the proton. The region around this active site residue is highly conserved and contains an
arginine residue which is involved in substrate binding.

Consensus pattern: S-N-H-G-[AG]-R-Q [H is the active site residue] [R is a substrate-binding
residue]

[ 1] Giegel D.A., Williams C.H. Jr., Massey V. J. Biol. Chem. 265:6626-6632(1990). 
[ 2] Tsou A.Y., Ransom S.C., Gerlt J.A., Buechter D.D., Babbitt P.C., Kenyon G.L.
Biochemistry 29:9856-9862(1990).

[ 3] Le K.H.D., Lederer F. J. Biol. Chem. 266:20877-20880(1991). 
[ 4] Lindqvist Y., Branden C.-I. J. Biol. Chem. 264:3624-3628(1989). 

195. Flavin-binding monooxygenase-like (FMO-like)

This family includes FMO proteins, cyclohexanone monooxygenase

196. Folylpolyglutamate synthase signatures (FPGS; aka Mur_ligase)

Folylpolyglutamate synthase (EC 6.3.2.17) (FPGS) [1] is the enzyme of folate metabolism
that catalyzes ATP-dependent addition of glutamate moieties to tetrahydrofolate.

Its sequence is moderately conserved between prokaryotes (gene folC) and eukaryotes. We
developed two signature patterns based on the conserved regions which are rich in glycine
residues and could play a role in the catalytic activity and/or in substrate binding.

Consensus pattern: [LIVMFY]-x-[LIVM]-[STAG]-G-T-[NK]-G-K-x-[ST]-x(7)-[LIVM](2)-x(3)-[GSK]
Sequences known to belong to this class detected by the pattern: ALL.

Consensus pattern: [LIVMFY](2)-E-x-G-[LIVM]-[GA]-G-x(2)-D-x-[GST]-x-[LIVM](2)
Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Shane B., Garrow T., Brenner A., Chen L., Choi Y.J., Hsu J.C., Stover P. Adv. Exp. 
Med. Biol. 338:629-634(1993). 


197. FYVE zinc finger 

The FYVE zinc finger is named after four proteins that it has been found in: Fab1,
YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn++
ions [1]. The FYVE finger has eight potential zinc coordinating cysteine positions. Many
members of this family also include two histidines in a motif R+HHC+XCG, where +
represents a charged residue and X any residue. Members were included which do not
conserve these histidine residues but are clearly related.
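
The R+HHC+XCG motif can be written as a regular expression. The sketch below is illustrative only: it assumes that "charged residue" means D, E, K or R (the text does not enumerate them), and the function name and test fragments are hypothetical. Family members that lack the two histidines would not be found by this expression.

```python
import re

# Assumption: "charged residue" is taken to mean D, E, K or R.
CHARGED = "[DEKR]"

# R+HHC+XCG: '+' is a charged residue, 'X' is any residue.
FYVE_MOTIF = re.compile("R" + CHARGED + "HHC" + CHARGED + ".CG")


def has_fyve_motif(sequence: str) -> bool:
    """Return True when the histidine-containing FYVE motif occurs in the sequence."""
    return FYVE_MOTIF.search(sequence.upper()) is not None


# Hypothetical fragments, used only to exercise the expression.
print(has_fyve_motif("ALTRKHHCRACGQ"))
print(has_fyve_motif("ALTRAHHCRACGQ"))
```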

[1] Stenmark H, Aasland R, Toh BH, D'Arrigo A, J Biol Chem 1996;271:24048-24054.
[2] Gaullier JM, Simonsen A, D'Arrigo A, Bremnes B, Stenmark H, Aasland R,
Nature 1998;394:432-433.

198. F-actin capping protein beta subunit signature (F_actin_cap_B)

The F-actin capping protein binds in a calcium-independent manner to the fast growing ends
of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends.
Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping
protein is a heterodimer composed of two unrelated subunits: alpha and beta.

The beta subunit is a protein of about 280 amino acid residues whose sequence is well
conserved in eukaryotic species [1]. As a signature pattern a conserved hexapeptide in the
N-terminal section of the beta subunit was selected.

Consensus pattern: C-D-Y-N-R-D
Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Amatruda J.F., Cannon J.F., Tatchell K., Hug C., Cooper J.A. Nature 344:352-354(1990).

199. Isopenicillin N synthetase signatures (Fe_Asc_oxidored) 

Isopenicillin N synthetase (IPNS) [1,2] is a key enzyme in the biosynthesis of penicillin and
cephalosporin. In the presence of oxygen, iron and ascorbate, it removes four hydrogen atoms
from L-(alpha-aminoadipyl)-L-cysteinyl-D-valine to form the azetidinone and thiazolidine
rings of isopenicillin. IPNS is an enzyme of about 330 amino-acid residues. Two cysteines
are conserved in fungal and bacterial IPNS sequences; these may be involved in iron-binding
and/or substrate-binding. Cephalosporium acremonium DAOCS/DACS [3] is a bifunctional
enzyme involved in cephalosporin biosynthesis. The DAOCS domain, which is structurally
related to IPNS, catalyzes the step from penicillin N to deacetoxy-cephalosporin C, which is
used as a substrate by DACS to form deacetylcephalosporin C. Streptomyces clavuligerus
possesses a monofunctional DAOCS enzyme (gene cefE) [4] also related to IPNS. Two signature
patterns for these enzymes were derived, centered around the conserved cysteine residues.

Consensus pattern: [RK]-x-[STA]-x(2)-S-x-C-Y-[SL]

Consensus pattern: [LIVM](2)-x-C-G-[STA]-x(2)-[STAG]-x(2)-T-x-[DNG]

[ 1] Martin J.F. Trends Biotechnol. 5:306-308(1987).
[ 2] Chen G., Shiffman D., Mevarech M., Aharonowitz Y. Trends Biotechnol. 8:105-111(1990).
[ 3] Samson S.M., Dotzlaf J.E., Slisz M.L., Becker G.W., van Frank R.M., Veal L.E., Yeh
W.K., Miller J.R., Queener S.W., Ingolia T.D. Bio/Technology 5:1207-1214(1987).
[ 4] Kovacevic S., Weigel B.J., Tobin M.B., Ingolia T.D., Miller J.R. J. Bacteriol. 171:754-760(1989).


200. Fibrillarin signature 

Fibrillarin [1] is a component of a nucleolar small nuclear ribonucleoprotein (SnRNP) particle
thought to participate in the first step of the processing of pre-rRNA. In mammals, fibrillarin
is associated with the U3, U8 and U13 small nuclear RNAs [2]. Fibrillarin is an extremely
well conserved protein of about 320 amino acid residues. Structurally it consists of three
different domains:
- An N-terminal domain of about 80 amino acids which is very rich in glycine and contains a
number of dimethylated arginine residues (DMA).
- A central domain of about 90 residues which resembles that of RNA-binding proteins and
contains an octameric sequence similar to the RNP-2 consensus found in such proteins.
- A C-terminal alpha-helical domain.

A protein evolutionarily related to fibrillarin has been found [3] in archaebacteria such as
Methanococcus vannielii or voltae. This protein (gene flpA) is involved in pre-rRNA
processing. It lacks the Gly/Arg-rich N-terminal domain. As a signature pattern, a region was
selected that starts with and encompasses the RNP-2 like octapeptide sequence.

Consensus pattern: [GST]-[LIVMAP]-V-Y-A-[IV]-E-[FY]-[SA]-x-R-x(2)-R-[DE]

[ 1] Aris J.P., Blobel G. Proc. Natl. Acad. Sci. U.S.A. 88:931-935(1991). 

[ 2] Bandziulis R.J., Swanson M.S., Dreyfuss G. Genes Dev. 3:431-437(1989). 

[ 3] Agha-Amiri K. J. Bacteriol. 176:2124-2127(1994).



201. Filamin/ABP280 repeat

[1] Fucini P, Renner C, Herberhold C, Noegel AA, Holak TA, Nat Struct Biol
1997;4:223-230.



202. Fucosyl transferase 

This family of fucosyltransferases comprises the enzymes that transfer
fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage [1].

[1] Breton C, Oriol R, Imberty A; Glycobiology 1998;8:87-94. 



203. 2Fe-2S ferredoxins, iron-sulfur binding region signature (fer2A) 

Ferredoxins [1] are a group of iron-sulfur proteins which mediate electron transfer in a wide
variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending
upon the physiological nature of the iron-sulfur cluster(s) and according to sequence
similarities. One of these subgroups is the 2Fe-2S ferredoxins, which are proteins or
domains of around one hundred amino acid residues that bind a single 2Fe-2S iron-sulfur
cluster. The proteins that are known [2] to belong to this family are listed below.
- Ferredoxin from photosynthetic organisms; namely plants and algae, where it is located in
the chloroplast or cyanelle; and cyanobacteria.
- Ferredoxin from archaebacteria of the Halobacterium genus.
- Ferredoxin IV (gene pftA) and V (gene fdxD) from Rhodobacter capsulatus.
- Ferredoxin in the toluene degradation operon (gene xylT) and naphthalene degradation
operon (gene nahT) of Pseudomonas putida.
- Hypothetical Escherichia coli protein yfaE.
- The N-terminal domain of the bifunctional ferredoxin/ferredoxin reductase electron transfer
component of the benzoate 1,2-dioxygenase complex (gene benC) from Acinetobacter
calcoaceticus, the toluene 4-monooxygenase complex (gene tmoF), the toluate
1,2-dioxygenase system (gene xylZ), and the xylene monooxygenase system (gene xylA) from
Pseudomonas.
- The N-terminal domain of phenol hydroxylase protein p5 (gene dmpP) from Pseudomonas putida.
- The N-terminal domain of methane monooxygenase component C (gene mmoC) from
Methylococcus capsulatus.
- The C-terminal domain of the vanillate degradation pathway protein vanB in a Pseudomonas
species.
- The N-terminal domain of bacterial fumarate reductase iron-sulfur protein (gene frdB).
- The N-terminal domain of CDP-6-deoxy-3,4-glucoseen reductase (gene ascD) from Yersinia
pseudotuberculosis.
- The central domain of eukaryotic succinate dehydrogenase (ubiquinone) iron-sulfur protein.
- The N-terminal domain of eukaryotic xanthine dehydrogenase.
- The N-terminal domain of eukaryotic aldehyde oxidase.

In the 2Fe-2S ferredoxins, four cysteine residues bind the iron-sulfur cluster. Three of these
cysteines are clustered together in the same region of the protein. Our signature pattern
spans that iron-sulfur binding region.

Consensus pattern: C-{C}-{C}-[GA]-{C}-C-[GAST]-{CPDEKRHFYW}-C [The three Cs
are 2Fe-2S ligands]
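
This pattern uses exclusion sets ({C} means any residue except cysteine). As a minimal sketch, the pattern can be hand-translated into a Python regular expression and used to report where the three clustered cysteine ligands fall. The function name and the example fragment are hypothetical and serve only to illustrate the notation.

```python
import re

# C-{C}-{C}-[GA]-{C}-C-[GAST]-{CPDEKRHFYW}-C written directly as a regex;
# each {..} exclusion becomes a negated character class.
FER2_SIGNATURE = re.compile(r"C[^C][^C][GA][^C]C[GAST][^CPDEKRHFYW]C")


def cysteine_ligand_positions(sequence: str) -> list[tuple[int, int, int]]:
    """Return 1-based positions of the three clustered cysteines for each match."""
    hits = []
    for match in FER2_SIGNATURE.finditer(sequence):
        start = match.start() + 1          # convert to 1-based numbering
        # The cysteines sit at pattern positions 1, 6 and 9.
        hits.append((start, start + 5, start + 8))
    return hits


# Hypothetical fragment used only for illustration.
print(cysteine_ligand_positions("SYSCRAGSCSSCAGK"))
```

The fourth cysteine ligand lies outside the signature region, so it is not located by this expression.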

[ 1] Meyer J. Trends Ecol. Evol. 3:222-226(1988).
[ 2] Harayama S., Polissi A., Rekik M. FEBS Lett. 285:85-88(1991).

Adrenodoxin family, iron-sulfur binding region signature (fer2B) 

Ferredoxins [1] are a group of iron-sulfur proteins which mediate electron transfer in a wide
variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending
upon the physiological nature of the iron-sulfur cluster(s) and according to sequence
similarities. One family of ferredoxins groups together the following proteins that all bind a
single 2Fe-2S iron-sulfur cluster:
- Adrenodoxin (ADX) (adrenal ferredoxin), a vertebrate mitochondrial protein which transfers
electrons from adrenodoxin reductase to cytochrome P450scc, which is involved in cholesterol
side chain cleavage.
- Putidaredoxin (PTX), a Pseudomonas putida protein which transfers electrons from
putidaredoxin reductase to cytochrome P450-cam, which is involved in the oxidation of camphor.
- Terpredoxin [2], a Pseudomonas protein which transfers electrons from terpredoxin reductase
to cytochrome P450-terp, which is involved in the oxidation of alpha-terpineol.
- Rhodocoxin [3], a Rhodococcus protein which transfers electrons from rhodocoxin reductase
to cytochrome CYP116 (thcB), which is involved in the degradation of thiocarbamate herbicides.
- Escherichia coli ferredoxin (gene fdx) [4] whose exact function is not yet known.
- Rhodobacter capsulatus ferredoxin VI [5], which may transfer electrons to a yet
uncharacterized oxygenase.
- Caulobacter crescentus ferredoxin (gene fdxB) [6].

In these proteins, four cysteine residues bind the iron-sulfur cluster. Three of these
cysteines are clustered together in the same region of the protein. Our signature pattern
spans that iron-sulfur binding region.

Consensus pattern: C-x(2)-[STAQ]-x-[STAMV]-C-[STA]-T-C-[HR] [The three C's are 2Fe-2S
ligands]

[ 1] Meyer J. Trends Ecol. Evol. 3:222-226(1988).
[ 2] Peterson J.A., Lu J.-Y., Geisselsoder J., Graham-Lorence S., Carmona C., Witney F.,
Lorence M.C. J. Biol. Chem. 267:14193-14203(1992).
[ 3] Nagy I., Schoofs G., Compernolle P., Proost P., Vanderleyden J., De Mot R. J. Bacteriol.
177:676-687(1995).
[ 4] Ta D.T., Vickery L.E. J. Biol. Chem. 267:11120-11125(1992).
[ 5] Naud I., Vincon M., Garin J., Gaillard J., Forest E., Jouanneau Y. Eur. J. Biochem.
222:933-939(1994).
[ 6] Amemiya K. EMBL/Genbank: X51607.



204. 4Fe-4S ferredoxins, iron-sulfur binding region signature (fer4)

Ferredoxins [1] are a group of iron-sulfur proteins which mediate electron transfer in a wide
variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending
upon the physiological nature of the iron-sulfur cluster(s). One of these subgroups is the
4Fe-4S ferredoxins, which are found in bacteria and which are thus often referred to as
'bacterial-type' ferredoxins. The structure of these proteins [2] consists of the duplication of a
domain of twenty-six amino acid residues; each of these domains contains four cysteine
residues that bind to a 4Fe-4S center. A number of proteins have been found [3] that include
one or more 4Fe-4S binding domains similar to those of bacterial-type ferredoxins. These
proteins are listed below (references are only provided for recently determined sequences).
- The iron-sulfur proteins of the succinate dehydrogenase and the fumarate reductase
complexes (EC 1.3.99.1). These enzyme complexes, which are components of the
tricarboxylic acid cycle, each contain three subunits: a flavoprotein, an iron-sulfur protein,
and a b-type cytochrome. The iron-sulfur proteins contain three different iron-sulfur centers:
a 2Fe-2S, a 3Fe-3S and a 4Fe-4S.
- Escherichia coli anaerobic glycerol-3-phosphate dehydrogenase (EC 1.1.99.5). This enzyme
is composed of three subunits: A, B, and C. The C subunit seems to be an iron-sulfur protein
with two ferredoxin-like domains in the N-terminal part of the protein.
- Escherichia coli anaerobic dimethyl sulfoxide reductase. The B subunit of this enzyme
(gene dmsB) is an iron-sulfur protein with four 4Fe-4S ferredoxin-like domains.
- Escherichia coli formate hydrogenlyase. Two of the subunits of this oligomeric complex
(genes hycB and hycF) seem to be iron-sulfur proteins that each contain two 4Fe-4S
ferredoxin-like domains.
- Methanobacterium formicicum formate dehydrogenase (EC 1.2.1.2). This enzyme is used by
the archaebacteria to grow on formate. The beta chain of this dimeric enzyme probably binds
two 4Fe-4S centers.
- Escherichia coli formate dehydrogenases N and O (EC 1.2.1.2). The beta chains of these two
enzymes (genes fdnH and fdoH) are iron-sulfur proteins with four 4Fe-4S ferredoxin-like
domains.
- Desulfovibrio periplasmic [Fe] hydrogenase (EC 1.18.99.1). The large chain of this dimeric
enzyme binds three 4Fe-4S centers, two of which are located in the ferredoxin-like N-terminal
region of the protein.
- Methanobacterium thermoautotrophicum methyl viologen-reducing hydrogenase subunit mvhB,
which contains six tandemly repeated ferredoxin-like domains and which probably binds twelve
4Fe-4S centers.
- Salmonella typhimurium anaerobic sulfite reductase (EC 1.8.1.-) [4]. Two of the subunits of
this enzyme (genes asrA and asrC) seem to both bind two 4Fe-4S centers.
- A ferredoxin-like protein (gene fixX) from the nitrogen-fixation genes locus of various
Rhizobium species, and one from the Nif-region of Azotobacter species.
- The 9 Kd polypeptide of chloroplast photosystem I [5] (gene psaC). This protein contains
two low potential 4Fe-4S centers, referred to as the A and B centers.
- The chloroplast frxB protein, which is predicted to carry two 4Fe-4S centers.
- A ferredoxin from a primitive eukaryote, the enteric amoeba Entamoeba histolytica.
- Escherichia coli hypothetical protein yjjW, a protein with an N-terminal region belonging to
the radical activating enzymes family (see <PDOC00834>) and two potential 4Fe-4S centers.

The pattern of cysteine residues in the iron-sulfur region is sufficient to detect this class
of 4Fe-4S binding proteins.

Consensus pattern: C-x(2)-C-x(2)-C-x(3)-C-[PEG] [The four Cs are 4Fe-4S ligands]
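
Because a protein may contain more than one such domain, it can be useful to list every occurrence of the cysteine pattern rather than only the first. The following is a minimal Python sketch of that idea; the function name and the example sequence are hypothetical, and matches are counted without overlap.

```python
import re

# C-x(2)-C-x(2)-C-x(3)-C-[PEG] written as a regular expression.
FER4_SIGNATURE = re.compile(r"C.{2}C.{2}C.{3}C[PEG]")


def ferredoxin_like_domains(sequence: str) -> list[int]:
    """Return the 1-based start of every non-overlapping signature match."""
    return [m.start() + 1 for m in FER4_SIGNATURE.finditer(sequence)]


# Hypothetical sequence with two cysteine clusters, for illustration only.
example = "MACIACGACKPECPVDAISQGDSIFVIDADTCIDCGNCANVCPVGAPVQE"
starts = ferredoxin_like_domains(example)
print(len(starts), "signature matches, starting at", starts)
```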


[ 1] Meyer J. Trends Ecol. Evol. 3:222-226(1988). 

[ 2] Otaka E., Ooi T. J. Mol. Evol. 26:257-267(1987). 

[ 3] Beinert H. FASEB J. 4:2483-2492(1990). 

[ 4] Huang C.J., Barrett E.L. J. Bacteriol. 173:1544-1553(1991). 

[ 5] Knaff D.B. Trends Biochem. Sci. 13:460-461(1988).

205. NifH/frxC family signatures (fer4_NifH) 



Reference No. 2750-942P 




Nitrogenase (EC 1.18.6.1) [1] is the enzyme system responsible for biological nitrogen
fixation. Nitrogenase is an oligomeric complex which consists of two components:
component 1, which contains the active site for the reduction of nitrogen to ammonia, and
component 2 (also called the iron protein). Component 2 is a homodimer of a protein (gene
nifH) which binds a single 4Fe-4S iron-sulfur cluster [2]. In the nitrogen fixation process nifH
is first reduced by a protein such as ferredoxin; the reduced protein then transfers electrons to
component 1 with the concomitant consumption of ATP.

A number of proteins are known to be evolutionarily related to nifH. These proteins are:
- Chloroplast encoded frxC (or chlL) protein [3]. FrxC is encoded on the chloroplast genome
of some plant species; its exact function is not known, but it could act as an electron carrier
in the conversion of protochlorophyllide to chlorophyllide.
- Rhodobacter capsulatus proteins bchL and bchX [4]. These proteins are also likely to play a
role in chlorophyll synthesis.

There are a number of conserved regions in the sequence of these proteins: in the N-terminal
section there is an ATP-binding site motif 'A' (P-loop) and in the central section there are two
conserved cysteines which have been shown, in nifH, to be the ligands of the 4Fe-4S cluster.
Two signature patterns that correspond to the regions around these cysteines were developed.

Consensus pattern: E-x-G-G-P-x(2)-[GA]-x-G-C-[AG]-G [C binds the iron-sulfur center]
Consensus pattern: D-x-L-G-D-V-V-C-G-G-F-[AG]-x-P [C binds the iron-sulfur center]

[ 1] Pau R.N. Trends Biochem. Sci. 14:183-186(1989).
[ 2] Georgiadis M.M., Komiya H., Chakrabarti P., Woo D., Kornuc J.J., Rees D.C. Science
257:1653-1659(1992).
[ 3] Fujita Y., Takahashi Y., Kohchi T., Ozeki H., Ohyama K., Matsubara H. Plant Mol. Biol.
13:551-561(1989).
[ 4] Burke D.H., Alberti M., Hearst J.E. J. Bacteriol. 175:2407-2413(1993).

206. Ferritin iron-binding regions signatures 

Ferritin [1,2] is one of the major non-heme iron storage proteins. It consists of a mineral core
of hydrated ferric oxide, and a multi-subunit protein shell which encloses the former and
assures its solubility in an aqueous environment. In animals the protein is mainly cytoplasmic
and there are generally two or more genes that encode closely related subunits (in
mammals there are two subunits which are known as H(eavy) and L(ight)). In plants ferritin
is found in the chloroplast [3]. There are a number of well conserved regions in the sequence
of ferritins. Two of these regions were selected to develop signature patterns. The first
pattern is located in the central part of the sequence of ferritin and it contains three
conserved glutamates which are thought to be involved in the binding of iron. The second
pattern is located in the C-terminal section; it corresponds to a region which forms a
hydrophilic channel through which small molecules and ions can gain access to the central
cavity of the molecule; this pattern also includes conserved acidic residues which are
potential metal-binding sites.

Consensus pattern: E-x-[KR]-E-x(2)-E-[KR]-[LF]-[LIVMA]-x(2)-Q-N-x-R-x-G-R [The 3 E's
are potential iron ligands]

Consensus pattern: D-x(2)-[LIVMF]-[STAC]-[DH]-F-[LI]-[EN]-x(2)-[FY]-L-x(6)-[LIVM]-[KN]
[The second D and the E are potential iron ligands]

[ 1] Crichton R.R., Charloteaux-Wauters M. Eur. J. Biochem. 164:485-506(1987).
[ 2] Theil E.C. Annu. Rev. Biochem. 56:289-315(1987).
[ 3] Ragland M., Briat J.-P., Gagnon J., Laulhere J.-P., Massenet O., Theil E.C. J. Biol.
Chem. 265:18339-18344(1990).

207. Intermediate filaments signature (filament) 

Intermediate filaments (IF) [1,2,3] are proteins which are primordial components of the
cytoskeleton and the nuclear envelope. They generally form filamentous structures 8 to 14
nm wide. IF proteins are members of a very large multigene family of proteins which has
been subdivided into five major subgroups:
- Type I: Acidic cytokeratins.
- Type II: Basic cytokeratins.
- Type III: Vimentin, desmin, glial fibrillary acidic protein (GFAP), peripherin, and plasticin.
- Type IV: Neurofilaments L, H and M, alpha-internexin and nestin.
- Type V: Nuclear lamins A, B1, B2 and C.

All IF proteins are structurally similar in that they consist of: a central rod domain
comprising some 300 to 350 residues which is arranged in coiled-coil alpha-helices, with at
least two short characteristic interruptions; an N-terminal non-helical domain (head) of
variable length; and a C-terminal domain (tail) which is also non-helical, and which shows
extreme length variation between different IF proteins. While IF proteins are evolutionarily
and structurally related, they have limited sequence homologies except in several regions of
the rod domain. A conserved region at the C-terminal extremity of the rod domain was used as
a sequence pattern for this class of proteins.

Consensus pattern: [IV]-x-[TACI]-Y-[RKH]-x-[LM]-L-[DE]

[ 1] Quinlan R., Hutchison C., Lane B. Protein Prof. 2:801-952(1995).
[ 2] Steiner P.M., Roop D.R. Annu. Rev. Biochem. 57:593-625(1988). 
[ 3] Stewart M. Curr. Opin. Cell Biol. 2:91-100(1990). 


208. Flavodoxin signature 

Flavodoxins [1,E1] are electron-transfer proteins that function in various electron transport
systems. Flavodoxins bind one FMN molecule, which serves as a redox-active prosthetic
group. Flavodoxins are functionally interchangeable with ferredoxins. They have been
isolated from prokaryotes, cyanobacteria, and some eukaryotic algae. The signature pattern
for these proteins is derived from a conserved region in their N-terminal section; this region is
involved in the binding of the FMN phosphate group.

Consensus pattern: [LIV]-[LIVFY]-[FY]-x-[ST]-x(2)-[AGC]-x-T-x(3)-A-x(2)-[LIV]

[ 1] Wakabayashi S., Kimura K., Matsubara H., Rogers L.J. Biochem. J. 263:981-984(1989). 

209. Growth factor and cytokines receptors family signatures (fn3) 

A number of receptors for lymphokines, hematopoietic growth factors and growth
hormone-related molecules have been found [1 to 5] to share a common binding domain.
Receptors known to belong to this family are:
- Cytokine receptor common beta chain. This chain is common to the IL-3, IL-5 and GM-CSF
receptors.
- Cytokine receptor common gamma chain. This chain is common to the IL-2, IL-4, IL-7 and
IL-13 receptors.
- Ciliary neurotrophic factor receptor (CNTFR).
- Erythropoietin receptor (EPOR).
- Granulocyte colony-stimulating factor receptor (G-CSFR).
- Granulocyte-macrophage colony-stimulating factor receptor alpha chain (GM-CSFR).
- Interleukin-2 receptor beta chain (IL2R-beta).
- Interleukin-3 receptor alpha chain (IL3R).
- Interleukin-4 receptor alpha chain (IL4R).
- Interleukin-5 receptor alpha chain (IL5R).
- Interleukin-6 receptor (IL6R).
- Interleukin-7 receptor alpha chain (IL7R).
- Interleukin-9 receptor (IL9R).
- Growth hormone receptor (GRHR).
- Prolactin receptor (PRLR).
- Thrombopoietin receptor (TPOR).

The conserved region constitutes all or part of the extracellular ligand-binding region and is
about 200 amino acid residues long. In the N-terminal of this domain there are two pairs of
cysteines known, in the growth hormone receptor, to be involved in disulfide bonds.

[Schematic: extracellular region bearing four conserved cysteines (C C C C), followed by the
transmembrane segment (XXXXXXX) and the cytoplasmic region]

Two patterns to detect this family of receptors were used. The first one is derived from the
first N-terminal disulfide loop, the second is a tryptophan-rich pattern located at the
C-terminal extremity of the extracellular region.

Consensus pattern: C-[LVFYR]-x(7,8)-[STIVDN]-C-x-W [The two Cs are linked by a
disulfide bond]-
Consensus pattern: [STGL]-x-W-[SG]-x-W-S-
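Because the two patterns are described as lying at opposite ends of the extracellular region, a candidate sequence can be checked for both motifs and for their relative order. The following is a minimal sketch in Python, assuming the two consensus patterns above have been transcribed by hand into regular expressions; the function name find_receptor_signatures and the example sequence are hypothetical.

```python
import re

# Hand-transcribed regular expressions for the two consensus patterns above.
DISULFIDE_LOOP = re.compile(r"C[LVFYR].{7,8}[STIVDN]C.W")
TRP_RICH = re.compile(r"[STGL].W[SG].WS")

def find_receptor_signatures(seq: str) -> dict:
    """Report where each signature first occurs and whether the order fits."""
    loop = DISULFIDE_LOOP.search(seq)
    trp = TRP_RICH.search(seq)
    loop_pos = loop.start() if loop else None
    trp_pos = trp.start() if trp else None
    return {
        "disulfide_loop": loop_pos,
        "trp_rich": trp_pos,
        # The text places the disulfide loop N-terminal to the Trp-rich motif.
        "ordered": loop is not None and trp is not None and loop_pos < trp_pos,
    }

# Hypothetical sequence built only to exercise both patterns.
example = "AAA" + "CLAAAAAAASCAW" + "G" * 20 + "SEWSEWS" + "AAA"
print(find_receptor_signatures(example))
```

Requiring both motifs in the expected order is a stricter filter than either pattern alone; whether that extra stringency is appropriate depends on the application.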

[ 1] Bazan J.F. Biochem. Biophys. Res. Commun. 164:788-795(1989). 
[ 2] Bazan J.F. Proc. Natl. Acad. Sci. U.S.A. 87:6934-6938(1990). 
[ 3] Cosman D., Lyman S.D., Idzerda R.L., Beckmann M.P., Park L.S., Goodwin R.G.,
March C.J. Trends Biochem. Sci. 15:265-270(1990).

[ 4] d'Andrea A.D., Fasman G.D., Lodish H.F. Cell 58:1023-1024(1989).

[ 5] d'Andrea A.D., Fasman G.D., Lodish H.F. Curr. Opin. Cell Biol. 2:648-651(1990). 

210. Phosphoribosylglycinamide formyltransferase active site (formyl_transf)

Phosphoribosylglycinamide formyltransferase (EC 2.1.2.2) (GART) [1] catalyzes the third
step in de novo purine biosynthesis, the transfer of a formyl group to
5'-phosphoribosylglycinamide. In higher eukaryotes, GART is part of a multifunctional enzyme
polypeptide that catalyzes three of the steps of purine biosynthesis. In bacteria, plants and
yeast, GART is a monofunctional protein of about 200 amino-acid residues. In the
Escherichia coli enzyme, an aspartic acid residue has been shown to be involved in the
catalytic mechanism. The region around this active site residue is well conserved in GART
from prokaryotic and eukaryotic sources and can be used as a signature pattern.

Mammalian formyltetrahydrofolate dehydrogenase (EC 1.5.1.6) [2] is a cytosolic enzyme
responsible for the NADP-dependent decarboxylative reduction of 10-formyltetrahydrofolate into
tetrahydrofolate. It is a protein of about 900 amino acids consisting of three domains; the
N-terminal domain (200 residues) is structurally related to GARTs.

Escherichia coli methionyl-tRNA formyltransferase (EC 2.1.2.9) (gene fmt) [3] is the enzyme
responsible for modifying the free amino group of the aminoacyl moiety of methionyl-tRNA(fMet).
The central part of fmt seems to be evolutionarily related to GART's active site region.

Consensus pattern: G-x-[STM]-[IVT]-x-[FYWVQ]-[VMAT]-x-[DEVM]-x-[LIVMY]-D-x-G-
x(2)-[LIVT]-x(6)-[LIVM] [D is the active site residue]-

[ 1] Inglese J., Smith J.M., Benkovic S.J. Biochemistry 29:6678-6687(1990).
[ 2] Cook R.J., Lloyd R.S., Wagner C. J. Biol. Chem. 266:4965-4973(1991).
[ 3] Guillon J.-M., Mechulam Y., Schmitter J.-M., Blanquet S., Fayat G. J. Bacteriol.
174:4294-4301(1992).

211. G10 protein signatures

A Xenopus protein known as G10 [1] has been found to be highly conserved in a wide range
of eukaryotic species. The function of G10 is still unknown. G10 is a protein of about 17 to
18 Kd (143 to 157 residues) which is hydrophilic and whose C-terminal half is rich in
cysteines and could be involved in metal-binding. As signature patterns, two of these
cysteine-rich segments were selected.

Consensus pattern: L-C-C-x-[KR]-C-x(4)-[DE]-x-N-x(4)-C-x-C-R-V-P-
Consensus pattern: C-x-H-C-G-C-[KRH]-G-C-[SA]- 

[ 1] McGrew L.L., Dworkin-Rastl E., Dworkin M.B., Richter J.D. Genes Dev. 3:803- 
815(1989). 



212. G-protein alpha subunit

G proteins couple receptors of extracellular signals to intracellular signaling
pathways. The G protein alpha subunit binds guanyl nucleotide and is a weak GTPase.
Number of members: 195

[1] Coleman DE, Berghuis AM, Lee E, Linder ME, Gilman AG, Sprang SR, Science
1994;265:1405-1412.

[2] How G proteins work: a continuing story. Coleman DE, Sprang SR, Trends Biochem Sci 
1996;21:41-44. 

213. Glucose-6-phosphate dehydrogenase active site (G6PD) 

Glucose-6-phosphate dehydrogenase (EC 1.1.1.49) (G6PD) [1] catalyzes the first step in the
pentose pathway, the oxidation of glucose-6-phosphate to gluconolactone 6-phosphate. A
lysine residue has been identified as a reactive nucleophile associated with the activity of the
enzyme. The sequence around this lysine is totally conserved from bacterial to mammalian
G6PD's and can be used as a signature pattern.

Consensus pattern: D-H-Y-L-G-K-[EQK] [K is the active site residue] - 

[1] Jeffery J., Persson B., Wood I., Bergman T., Jeffery R., Joernvall H. Eur. J. Biochem.
212:41-49(1993).

214. GATA-type zinc finger domain 

The GATA family of transcription factors are proteins that bind to DNA sites with the
consensus sequence (A/T)GATA(A/G), found within the regulatory region of a number of
genes. Proteins currently known to belong to this family are:

- GATA-1 [1] (also known as Eryf1, GF-1 or NF-E1), which binds to the GATA region of
globin genes and other genes expressed in erythroid cells. It is a transcriptional activator
which probably serves as a general 'switch' factor for erythroid development.
- GATA-2 [2], a transcriptional activator which regulates endothelin-1 gene expression in
endothelial cells.
- GATA-3 [3], a transcriptional activator which binds to the enhancer of the T-cell receptor
alpha and delta genes.
- GATA-4 [4], a transcriptional activator expressed in endodermally derived tissues and heart.
- Drosophila protein pannier (or DGATAa) (gene pnr), which acts as a repressor of the
achaete-scute complex (as-c).
- Bombyx mori BCFI [5], which regulates the expression of chorion genes.
- Caenorhabditis elegans elt-1 and elt-2, transcriptional activators of genes containing the
GATA region, including vitellogenin genes [6].
- Ustilago maydis urbs1 [7], a protein involved in the repression of the biosynthesis of
siderophores.
- Fission yeast protein GAF2.

All these transcription factors contain a pair of highly similar 'zinc finger' type domains
with the consensus sequence C-x2-C-x17-C-x2-C.

Some other proteins contain a single zinc finger motif highly related to those of the GATA
transcription factors. These proteins are:

- Drosophila box A-binding factor (ABF) (also known as protein serpent (gene srp)), which
may function as a transcriptional activator protein and may play a key role in the
organogenesis of the fat body.
- Emericella nidulans areA [8], a transcriptional activator which mediates nitrogen metabolite
repression.
- Neurospora crassa nit-2 [9], a transcriptional activator which turns on the expression of
genes coding for enzymes required for the use of a variety of secondary nitrogen sources,
during conditions of nitrogen limitation.
- Neurospora crassa white collar proteins 1 and 2 (WC-1 and WC-2), which control expression
of light-regulated genes.
- Saccharomyces cerevisiae DAL81 (or UGA43), a negative nitrogen regulatory protein.
- Saccharomyces cerevisiae GLN3, a positive nitrogen regulatory protein.
- Saccharomyces cerevisiae GAT1.
- Saccharomyces cerevisiae GZF3.
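The DNA consensus (A/T)GATA(A/G) given at the start of this section can be located in a nucleotide sequence on either strand. The following is a minimal sketch in Python, assuming an unambiguous A/C/G/T sequence; the function names and the example sequences are hypothetical, and positions are 0-based on the forward strand.

```python
import re

# (A/T)GATA(A/G) written as a regular expression; the lookahead allows
# overlapping sites to be reported.
SITE = re.compile(r"(?=([AT]GATA[AG]))")
SITE_LENGTH = 6

def revcomp(dna: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_gata_sites(dna: str) -> list:
    """Return (forward-strand start, strand) for every consensus site."""
    dna = dna.upper()
    hits = [(m.start(), "+") for m in SITE.finditer(dna)]
    for m in SITE.finditer(revcomp(dna)):
        # Convert the reverse-strand coordinate back to the forward strand.
        hits.append((len(dna) - m.start() - SITE_LENGTH, "-"))
    return sorted(hits)

print(find_gata_sites("CCAGATAAGG"))   # site on the forward strand
print(find_gata_sites("GGCTATCACC"))   # site on the reverse strand
```

A six-base consensus of this kind occurs frequently by chance, so hits are only candidates for regulatory sites.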

Consensus pattern: C-x-[DN]-C-x(4,5)-[ST]-x(2)-W-[HR]-[RK]-x(3)-[GN]-x(3,4)-C-N-
[AS]-C [The four C's are zinc ligands]
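The spacing between the four zinc-ligand cysteines can be read directly from the consensus pattern. The following is a minimal sketch in Python that sums the minimum and maximum number of residues between consecutive fixed C positions; the function names are hypothetical and the pattern string is transcribed by hand from the line above.

```python
import re

PATTERN = "C-x-[DN]-C-x(4,5)-[ST]-x(2)-W-[HR]-[RK]-x(3)-[GN]-x(3,4)-C-N-[AS]-C"

def element_length(element: str) -> tuple:
    """Return the (min, max) number of residues one pattern element covers."""
    m = re.search(r"\((\d+)(?:,(\d+))?\)$", element)
    if m is None:
        return 1, 1
    lo = int(m.group(1))
    hi = int(m.group(2)) if m.group(2) else lo
    return lo, hi

def cysteine_gaps(pattern: str) -> list:
    """Return (min, max) residue counts between consecutive fixed cysteines."""
    gaps = []
    current = None  # running gap since the last fixed C; None before the first
    for element in pattern.split("-"):
        if element == "C":
            if current is not None:
                gaps.append(current)
            current = (0, 0)
        elif current is not None:
            lo, hi = element_length(element)
            current = (current[0] + lo, current[1] + hi)
    return gaps

print(cysteine_gaps(PATTERN))  # [(2, 2), (17, 19), (2, 2)]
```

The pattern therefore allows 17 to 19 residues between the second and third cysteines, which includes the 17-residue spacing of the C-x2-C-x17-C-x2-C consensus quoted in the text.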

[ 1] Trainor C.D., Evans T., Felsenfeld G., Boguski M.S. Nature 343:92-96(1990).
[ 2] Lee M.E., Temizer D.T., Clifford J.A., Quertermous T. J. Biol. Chem.
266:16188-16192(1991).
[ 3] Ho I.-C., Vorhees P., Marin N., Oakley B.K., Tsai S.-F., Orkin S.H., Leiden J.M. EMBO
J. 10:1187-1192(1991).
[ 4] Spieth J., Shim Y.H., Lea K., Conrad R., Blumenthal T. Mol. Cell. Biol.
11:4651-4659(1991).
[ 5] Drevet J.R., Skeiky Y.A., Iatrou K. J. Biol. Chem. 269:10660-10667(1994).
[ 6] Hawkins M.G., McGhee J.D. J. Biol. Chem. 270:14666-14671(1995).
[ 7] Voisard C.P.O., Wang J., Xu P., Leong S.A., McEvoy J.L. Mol. Cell. Biol.
13:7091-7100(1993).
[ 8] Arst H.N. Jr., Kudla B., Martinez-Rossi N.M., Caddick M.X., Sibley S., Davies R.W.
Trends Genet. 5:291-291(1989).
[ 9] Fu Y.-H., Marzluf G.A. Mol. Cell. Biol. 10:1056-1065(1990).

215. Glutamine amidotransferases class-I active site (GATase) 

A large group of biosynthetic enzymes are able to catalyze the removal of the ammonia group
from glutamine and then to transfer this group to a substrate to form a new carbon-nitrogen
group. This catalytic activity is known as glutamine amidotransferase (GATase) (EC 2.4.2.-)
[1]. The GATase domain exists either as a separate polypeptidic subunit or as part of a larger
polypeptide fused in different ways to a synthase domain. On the basis of sequence
similarities two classes of GATase domains have been identified [2,3]: class-I (also known as
trpG-type) and class-II (also known as purF-type). Class-I GATase domains have been found
in the following enzymes:

- The second component of anthranilate synthase (AS) (EC 4.1.3.27) [4]. AS catalyzes the
biosynthesis of anthranilate from chorismate and glutamine. AS is generally a dimeric
enzyme: the first component can synthesize anthranilate using ammonia rather than
glutamine, whereas component II provides the GATase activity. In some bacteria and in fungi
the GATase component of AS is part of a multifunctional protein that also catalyzes other
steps of the biosynthesis of tryptophan.

- The second component of 4-amino-4-deoxychorismate (ADC) synthase (EC 4.1.3.-), a
dimeric prokaryotic enzyme that functions in the pathway that catalyzes the biosynthesis of
para-aminobenzoate (PABA) from chorismate and glutamine. The second component (gene
pabA) provides the GATase activity [4].

- CTP synthase (EC 6.3.4.2). CTP synthase catalyzes the final reaction in the biosynthesis of
pyrimidine, the ATP-dependent formation of CTP from UTP and glutamine. CTP synthase is
a single chain enzyme that contains two distinct domains; the GATase domain is in the
C-terminal section [2].

- GMP synthase (glutamine-hydrolyzing) (EC 6.3.5.2). GMP synthase catalyzes the
ATP-dependent formation of GMP from xanthosine 5'-phosphate and glutamine. GMP synthase is
a single chain enzyme that contains two distinct domains; the GATase domain is in the
N-terminal section [5].

- Glutamine-dependent carbamoyl-phosphate synthase (EC 6.3.5.5) (GD-CPSase); an
enzyme involved in both arginine and pyrimidine biosynthesis and which catalyzes the
ATP-dependent formation of carbamoyl phosphate from glutamine and carbon dioxide. In bacteria
GD-CPSase is composed of two subunits: the large chain (gene carB) provides the CPSase
activity, while the small chain (gene carA) provides the GATase activity. In yeast the
enzyme involved in arginine biosynthesis is also composed of two subunits: CPA1 (GATase),
and CPA2 (CPSase). In most eukaryotes, the first three steps of pyrimidine biosynthesis are
catalyzed by a large multifunctional enzyme (called URA2 in yeast, rudimentary in
Drosophila, and CAD in mammals). The GATase domain is located at the N-terminal
extremity of this polyprotein [6].

- Phosphoribosylformylglycinamidine synthase II (EC 6.3.5.3), an enzyme that catalyzes the
fourth step in the de novo biosynthesis of purines. In some species of bacteria, FGAM
synthase II is composed of two subunits: a small chain (gene purQ) which provides the
GATase activity and a large chain (gene purL) which provides the aminator activity.

- The histidine amidotransferase hisH, an enzyme that catalyzes the fifth step in the
biosynthesis of histidine in prokaryotes.

In the second component of AS a cysteine has been shown [7] to be essential for the
amidotransferase activity. The sequence around this residue is well conserved in all the
above GATase domains and can be used as a signature pattern for class-I GATase.

Consensus pattern: [PAS]-[LIVMFYT]-[LIVMFY]-G-[LIVMFY]-C-[LIVMFYN]-G-x-
[QEH]-x-[LIVMFA] [C is the active site residue]-

[ 1] Buchanan J.M. Adv. Enzymol. 39:91-183(1973).
[ 2] Weng M., Zalkin H. J. Bacteriol. 169:3023-3028(1987).
[ 3] Nyunoya H., Lusty C.J. J. Biol. Chem. 259:9790-9798(1984).
[ 4] Crawford I.P. Annu. Rev. Microbiol. 43:567-600(1989).
[ 5] Zalkin H., Argos P., Narayana S.V.L., Tiedeman A.A., Smith J.M. J. Biol. Chem.
260:3350-3354(1985).
[ 6] Davidson J.N., Chen K.C., Jamison R.S., Musmanno L.A., Kern C.B. BioEssays
15:157-164(1993).
[ 7] Tso J.Y., Hermodson M.A., Zalkin H. J. Biol. Chem. 255:1451-1457(1980).



216. Glutamine amidotransferases class-II active site (GATase_2)

A large group of biosynthetic enzymes are able to catalyze the removal of the ammonia group
from glutamine and then to transfer this group to a substrate to form a new carbon-nitrogen
group. This catalytic activity is known as glutamine amidotransferase (GATase) (EC 2.4.2.-)
[1]. The GATase domain exists either as a separate polypeptidic subunit or as part of a larger
polypeptide fused in different ways to a synthase domain. On the basis of sequence
similarities two classes of GATase domains have been identified [2,3]: class-I (also known as
trpG-type) and class-II (also known as purF-type). Class-II GATase domains have been
found in the following enzymes:

- Amido phosphoribosyltransferase (glutamine phosphoribosylpyrophosphate
amidotransferase) (EC 2.4.2.14). An enzyme which catalyzes the first step in purine
biosynthesis, the transfer of the ammonia group of glutamine to PRPP to form
5-phosphoribosylamine (gene purF in bacteria, ADE4 in yeast).

- Glucosamine--fructose-6-phosphate aminotransferase (EC 2.6.1.16). This enzyme catalyzes
a key reaction in amino sugar synthesis, the formation of glucosamine 6-phosphate from
fructose 6-phosphate and glutamine (gene glmS in Escherichia coli, nodM in Rhizobium,
GFA1 in yeast).

- Asparagine synthetase (glutamine-hydrolyzing) (EC 6.3.5.4). This enzyme is responsible
for the synthesis of asparagine from aspartate and glutamine.

A cysteine is present at the N-terminal extremity of the mature form of all these enzymes. The
cysteine has been shown, in amido phosphoribosyltransferase [4] and in asparagine
synthetase [5], to be important for the catalytic mechanism.



Consensus pattern: <x(0,11)-C-[GS]-[IV]-[LIVMFYW]-[AG] [C is the active site residue]-
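In this notation a leading '<' anchors the pattern to the N-terminal end of the sequence, so the active site cysteine must lie within the first twelve residues. The following is a minimal sketch in Python, assuming the pattern above has been transcribed by hand into a regular expression; the function name active_site_cysteine and the example sequences are hypothetical.

```python
import re

# '<' becomes the start-of-string anchor '^'; x(0,11) becomes '.{0,11}'.
# The cysteine is captured so that its position can be reported.
CLASS_II = re.compile(r"^.{0,11}(C)[GS][IV][LIVMFYW][AG]")

def active_site_cysteine(seq: str):
    """Return the 0-based index of the candidate active-site Cys, or None."""
    m = CLASS_II.match(seq)
    return m.start(1) if m else None

# Hypothetical sequences used only to exercise the anchor.
print(active_site_cysteine("CGIVGAAA"))             # Cys is the first residue
print(active_site_cysteine("MKLAS" + "CGVFG"))      # Cys after a short leader
print(active_site_cysteine("M" * 12 + "CGIVG"))     # leader too long: None
```

The up to eleven leading residues presumably accommodate sequences that still carry residues preceding the mature N-terminal cysteine; that reading is an interpretation of the pattern, not a statement from the text.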

[ 1] Buchanan J.M. Adv. Enzymol. 39:91-183(1973).
[ 2] Weng M., Zalkin H. J. Bacteriol. 169:3023-3028(1987).
[ 3] Nyunoya H., Lusty C.J. J. Biol. Chem. 259:9790-9798(1984).
[ 4] van Heeke G., Schuster M. J. Biol. Chem. 264:5503-5509(1989).
[ 5] Vollmer S.J., Switzer R.L., Hermodson M.A., Bower S.G., Zalkin H. J. Biol. Chem.
258:10582-10585(1983).



217. GDP dissociation inhibitor (GDI) 

[1] Schalk I, Zeng K, Wu SK, Stura EA, Matteson J, Huang M, Tandon A, Wilson IA,
Balch WE, Nature 1996;381:42-48.



218. Oxidoreductase family (GFO_IDH_MocA)

This family of enzymes utilises NADP or NAD. It is called the GFO/IDH/MOCA family in
Swiss-Prot.

[1] Kingston RL, Scopes RK, Baker EN, Structure 1996;4:1413-1428. 

219. GHMP kinases putative ATP-binding domain 

The following kinases contain, in their N-terminal section, a conserved Gly/Ser-rich region
which is probably involved in the binding of ATP [1]:

- Galactokinase (EC 2.7.1.6).
- Homoserine kinase (EC 2.7.1.39).
- Mevalonate kinase (EC 2.7.1.36).
- Phosphomevalonate kinase (EC 2.7.4.2).

This group of kinases was called 'GHMP' (from the first letter of their substrates).

Consensus pattern: [LIVM]-[PK]-x-[GSTA]-x(0,1)-G-L-[GS]-S-S-[GSA]-[GSTAC]-

[1] Tsay Y.H., Robinson G.W. Mol. Cell. Biol. 11:620-631(1991).

220. Glucose inhibited division protein A family signatures (GIDA) 

Bacterial glucose inhibited division protein A (gene gidA) is a protein of 70 Kd whose
function is not yet known and whose sequence is highly conserved. It is evolutionarily related
to yeast hypothetical protein YGL236C, Caenorhabditis elegans hypothetical protein
F52H3.2 and a Bacillus subtilis protein called gid (which is different from B. subtilis
gidA). Two highly conserved regions were selected as signature patterns. Both regions are
located in the central region of the protein.

Consensus pattern: [GS]-[PT]-x-Y-C-P-S-[LIVM]-E-x-K-[LIVM]-x-[KR]-

Consensus pattern: A-G-Q-x-[NT]-G-x(2)-G-Y-x-E-[SAG](3)-[QS]-G-[LIVM](2)-A-G-
[LIVMT]-N-A-

221. (GLFV_dehydrog) 

Glu / Leu / Phe / Val dehydrogenases active site

- Glutamate dehydrogenases (EC 1.4.1.2, EC 1.4.1.3, and EC 1.4.1.4) (GluDH)
are enzymes that catalyze the NAD- or NADP-dependent reversible deamination
of glutamate into alpha-ketoglutarate [1,2]. GluDH isozymes are generally
involved with either ammonia assimilation or glutamate catabolism.

- Leucine dehydrogenase (EC 1.4.1.9) (LeuDH) is a NAD-dependent enzyme that
catalyzes the reversible deamination of leucine and several other aliphatic
amino acids to their keto analogues [3].

- Phenylalanine dehydrogenase (EC 1.4.1.20) (PheDH) is a NAD-dependent enzyme
that catalyzes the reversible deamination of L-phenylalanine into
phenylpyruvate [4].

- Valine dehydrogenase (EC 1.4.1.8) (ValDH) is a NADP-dependent enzyme that
catalyzes the reversible deamination of L-valine into 3-methyl-2-
oxobutanoate [5].

These dehydrogenases are structurally and functionally related. A conserved lysine residue
located in a glycine-rich region has been implicated in the catalytic mechanism. The
conservation of the region around this residue allows the derivation of a signature pattern for
this type of enzyme.

Consensus pattern: [LIV]-x(2)-G-G-[SAG]-K-x-[GV]-x(3)-[DNST]-[PL] [K is the active site
residue]
Sequences known to belong to this class detected by the pattern: ALL.

Note: all known sequences from this family have Pro in the last position of the pattern, with
the exception of yeast GluDH, which has Leu.

[ 1] Britton K.L., Baker P.J., Rice D.W., Stillman T.J. Eur. J. Biochem. 209:851-859(1992). 
[ 2] Benachenhou-Lahfa N., Forterre P., Labedan B. J. Mol. Evol. 36:335-346(1993). 
[ 3] Nagata S., Tanizawa K., Esaki N., Sakamoto Y., Ohshima T., Tanaka H., Soda K. 
Biochemistry 27:9056-9062(1988). 

[ 4] Takada H., Yoshimura T., Ohshima T., Esaki N., Soda K. J. Biochem.
109:371-376(1991).

[ 5] Hutchinson C.R., Tang L. J. Bacteriol. 175:4176-4185(1993).

222. GMC oxidoreductases signatures

The following FAD flavoprotein oxidoreductases have been found [1,2] to be evolutionarily
related. These enzymes, which are called 'GMC oxidoreductases', are listed below.

- Glucose oxidase (EC 1.1.3.4) (GOX) from Aspergillus niger. Reaction catalyzed: glucose +
oxygen -> delta-gluconolactone + hydrogen peroxide.
- Methanol oxidase (EC 1.1.3.13) (MOX) from fungi. Reaction catalyzed: methanol + oxygen
-> formaldehyde + hydrogen peroxide.
- Choline dehydrogenase (EC 1.1.99.1) (CHD) from bacteria. Reaction catalyzed: choline +
unknown acceptor -> betaine aldehyde + reduced acceptor.
- Glucose dehydrogenase (GLD) (EC 1.1.99.10) from Drosophila. Reaction catalyzed: glucose
+ unknown acceptor -> delta-gluconolactone + reduced acceptor.
- Cholesterol oxidase (CHOD) (EC 1.1.3.6) from Brevibacterium sterolicum and
Streptomyces strain SA-COO. Reaction catalyzed: cholesterol + oxygen -> cholest-4-en-3-one
+ hydrogen peroxide.
- AlkJ [3], an alcohol dehydrogenase from Pseudomonas oleovorans, which converts aliphatic
medium-chain-length alcohols into aldehydes.

This family also includes a lyase:

- (R)-mandelonitrile lyase (EC 4.1.2.10) (hydroxynitrile lyase) from plants [4], an enzyme
involved in cyanogenesis, the release of hydrogen cyanide from injured tissues.

These enzymes are proteins of size ranging from 556 (CHD) to 664 (MOX) amino acid residues
which share a number of regions of sequence similarities. One of these regions, located in
the N-terminal section, corresponds to the FAD ADP-binding domain. The function of the other
conserved domains is not yet known; two of these domains were selected as signature patterns.
The first one is located in the N-terminal section of these enzymes, about 50 residues after
the ADP-binding domain, while the second one is located in the central section.

Consensus pattern: [GA]-[RKN]-x-[LIV]-G(2)-[GST](2)-x-[LIVM]-N-x(3)-[FYWA]-x(2)-
[PAG]-x(5)-[DNESH]-

Consensus pattern: [GS]-[PSTA]-x(2)-[ST]-P-x-[LIVM](2)-x(2)-S-G-[LIVM]-G- 

[ 1] Cavener D.R. J. Mol. Biol. 223:811-814(1992). 

[ 2] Henikoff S., Henikoff J.G. Genomics 19:97-107(1994). 

[ 3] van Beilen J.B., Eggink G., Enequist H., Bos R., Witholt B. Mol. Microbiol. 6:3121- 
3136(1992). 

[ 4] Cheng I.P., Poulton J.E. Plant Cell Physiol. 34:1139-1143(1993).

223. (GMP_synt_C)

Glutamine amidotransferases class-I active site 

A large group of biosynthetic enzymes are able to catalyze the removal of the ammonia group 
from glutamine and then to transfer this group to a substrate to form a new carbon-nitrogen 
group. This catalytic activity is known as glutamine amidotransferase (GATase) (EC 2.4.2.-)
[1]. The GATase domain exists either as a separate polypeptidic subunit or as part of a larger 
polypeptide fused in different ways to a synthase domain. On the basis of sequence 
similarities two classes of GATase domains have been identified [2,3]: class-I (also known as 
trpG-type) and class-II (also known as purF-type). Class-I GATase domains have been found 
in the following enzymes: 

- The second component of anthranilate synthase (AS) (EC 4.1.3.27) [4]. AS catalyzes the 
biosynthesis of anthranilate from chorismate and glutamine. AS is generally a dimeric 
enzyme: the first component can synthesize anthranilate using ammonia rather than 
glutamine, whereas component II provides the GATase activity. In some bacteria and in fungi 
the GATase component of AS is part of a multifunctional protein that also catalyzes other 
steps of the biosynthesis of tryptophan. 

- The second component of 4-amino-4-deoxychorismate (ADC) synthase (EC 4.1.3.-), a
dimeric prokaryotic enzyme that functions in the pathway that catalyzes the biosynthesis of
para-aminobenzoate (PABA) from chorismate and glutamine. The second component (gene 
pabA) provides the GATase activity [4]. 

- CTP synthase (EC 6.3.4.2). CTP synthase catalyzes the final reaction in the biosynthesis of 
pyrimidine, the ATP-dependent formation of CTP from UTP and glutamine. CTP synthase is 
a single chain enzyme that contains two distinct domains; the GATase domain is in the
C-terminal section [2].

- GMP synthase (glutamine-hydrolyzing) (EC 6.3.5.2). GMP synthase catalyzes the
ATP-dependent formation of GMP from xanthosine 5'-phosphate and glutamine. GMP synthase is
a single chain enzyme that contains two distinct domains; the GATase domain is in the
N-terminal section [5].

- Glutamine-dependent carbamoyl-phosphate synthase (EC 6.3.5.5) (GD-CPSase); an
enzyme involved in both arginine and pyrimidine biosynthesis and which catalyzes the
ATP-dependent formation of carbamoyl phosphate from glutamine and carbon dioxide. In bacteria
GD-CPSase is composed of two subunits: the large chain (gene carB) provides the CPSase
activity, while the small chain (gene carA) provides the GATase activity. In yeast the
enzyme involved in arginine biosynthesis is also composed of two subunits: CPA1 (GATase),
and CPA2 (CPSase). In most eukaryotes, the first three steps of pyrimidine biosynthesis are 
catalyzed by a large multifunctional enzyme (called URA2 in yeast, rudimentary in 
Drosophila, and CAD in mammals). The GATase domain is located at the N-terminal 
extremity of this polyprotein [6]. 

- Phosphoribosylformylglycinamidine synthase II (EC 6.3.5.3), an enzyme that catalyzes the 
fourth step in the de novo biosynthesis of purines. In some species of bacteria, FGAM 
synthase II is composed of two subunits: a small chain (gene purQ) which provides the 
GATase activity and a large chain (gene purL) which provides the aminator activity. 

- The histidine amidotransferase hisH, an enzyme that catalyzes the fifth step in the 
biosynthesis of histidine in prokaryotes. 

In the second component of AS a cysteine has been shown [7] to be essential for the 
amidotransferase activity. The sequence around this residue is well conserved in all the 
above GATase domains and can be used as a signature pattern for class-I GATase. 

Consensus pattern: [PAS]-[LIVMFYT]-[LIVMFY]-G-[LIVMFY]-C-[LIVMFYN]-G-x-
[QEH]-x-[LIVMFA] [C is the active site residue]
Sequences known to belong to this class detected by the pattern: ALL, except for 6 sequences.

Note: in the first position of the pattern Pro is found in all cases except in the slime mold GD- 
CPSase where it is replaced by Ala. 

[ 1] Buchanan J.M. Adv. Enzymol. 39:91-183(1973). 

[ 2] Weng M., Zalkin H. J. Bacteriol. 169:3023-3028(1987). 

[ 3] Nyunoya H., Lusty C.J. J. Biol. Chem. 259:9790-9798(1984). 

[ 4] Crawford I.P. Annu. Rev. Microbiol. 43:567-600(1989).



[ 5] Zalkin H., Argos P., Narayana S.V.L., Tiedeman A.A., Smith J.M. J. Biol. Chem. 
260:3350-3354(1985). 

[ 6] Davidson J.N., Chen K.C., Jamison R.S., Musmanno L.A., Kern C.B. BioEssays 15:157- 
164(1993). 

[ 7] Tso J.Y., Hermodson M.A., Zalkin H. J. Biol. Chem. 255:1451-1457(1980). 

224. Glutathione peroxidases signatures (GSHPx)

Glutathione peroxidase (EC 1.11.1.9) (GSHPx) [1,2] is an enzyme that catalyzes the
reduction of hydroperoxides by glutathione. Its main function is to protect against the
damaging effect of endogenously formed hydroperoxides. In higher vertebrates at least
four forms of GSHPx are known to exist: a ubiquitous cytosolic form (GSHPx-1), a
gastrointestinal cytosolic form (GSHPx-GI) [3], a plasma secreted form (GSHPx-P) [4], and an
epididymal secretory form (GSHPx-EP). In addition to these characterized forms, the
sequence of a protein of unknown function [5] has been shown to be evolutionarily related to
those of GSHPx's. In filarial nematode parasites such as Brugia pahangi the major soluble
cuticular protein, known as gp29, is a secreted GSHPx which could provide a mechanism of
resistance to the immune reaction of the mammalian host by neutralizing the products of the
oxidative burst of leukocytes [6]. Escherichia coli protein btuE, a periplasmic protein involved
in the transport of vitamin B12, is also evolutionarily related to GSHPx's; the significance of
this relationship is not yet clear. Selenium, in the form of selenocysteine [7], is part of the
catalytic site of GSHPx. The sequence around the selenocysteine residue is moderately well
conserved in GSHPx's and the related proteins and can be used as a signature pattern. As a
second signature for this family of proteins a highly conserved octapeptide located in the
central section of these proteins was selected.

Consensus pattern: [GN]-[RKHNFYC]-x-[LIVMFC]-[LIVMF](2)-x-N-[VT]-x-[STC]-x-C-
[GA]-x-T [C is the active site selenocysteine residue]
Consensus pattern: [LIV]-[AGD]-F-P-[CS]-[NG]-Q-

[ 1] Mannervik B. Meth. Enzymol. 113:490-495(1985).

[ 2] Mullenbach G.T., Tabrizi A., Irvine B.D., Bell G.I., Tainer J.A., Hallewell R.A. Protein
Eng. 2:239-246(1988).

[ 3] Chu F.F., Doroshow J.H., Esworthy R.S. J. Biol. Chem. 268:2571-2576(1993).

[ 4] Takahashi K., Akasaka M., Yamamoto Y., Kobayashi C., Mizoguchi J., Koyama J. J.
Biochem. 108:145-148(1990).

[ 5] Dunn D.K., Howells D.D., Richardson J., Goldfarb P.S. Nucleic Acids Res. 17:6390- 
6390(1989). 

[ 6] Cookson E., Blaxter M.L., Selkirk M.E. Proc. Natl. Acad. Sci. U.S.A. 89:5837- 
5841(1992). 

[ 7] Stadtman T.C. Annu. Rev. Biochem. 59:111-127(1990). 

225. (GST) 

Glutathione S-transferases

Function: conjugation of reduced glutathione to a variety of targets. Also included in
the alignment, but not GSTs, are:

- S-crystallins from squid. Similarity to GST was previously noted.
- Eukaryotic elongation factors 1-gamma. Not known to have GST activity; similarity
not previously recognized. Supported by HMM and manual alignment inspection.
- HSP26 family of stress-related proteins, including auxin-regulated proteins in plants and
stringent starvation proteins in E. coli. Not known to have GST activity. Similarity not
previously recognized. Supported by HMM and manual alignment inspection.

Alignment spans entire protein.

226. GTP1/OBG family signature

A widespread family of GTP-binding proteins has been recently characterized [1,2]. This
family currently includes:

- Mouse and Xenopus protein DRG.
- Human protein DRG2.
- Drosophila protein 128up.
- Fission yeast protein gtp1.
- A Halobacterium cutirubrum hypothetical protein in a ribosomal protein gene cluster.
- Bacillus subtilis protein obg. Obg has been experimentally shown to bind GTP.
- Escherichia coli hypothetical protein yhbZ.
- Haemophilus influenzae hypothetical protein HI0877.
- Mycoplasma genitalium hypothetical protein MG384.
- Yeast hypothetical protein YAL036c (FUN11).
- Yeast hypothetical protein YGR173w.
- Caenorhabditis elegans hypothetical protein C02F5.3.

The function of the proteins that belong to this family is not yet known. They are
polypeptides of about 40 to 48 Kd which contain the five small sequence elements
characteristic of GTP-binding proteins [3]. As a signature pattern the region that corresponds
to the ATP/GTP B motif (also called G-3 in GTP-binding proteins) was selected.

Consensus pattern: D-[LIVM]-P-G-[LIVM](2)-[DEY]-[GN]-A-x(2)-G-x-G-

[ 1] Sazuka T., Tomooka Y., Ikawa Y., Noda M., Kumar S. Biochem. Biophys. Res. 
Commun. 189:363-370(1992). 

[ 2] Hudson J.D., Young P.G. Gene 125:191-193(1993). 

[ 3] Bourne H.R., Sanders D.A., McCormick F. Nature 349:117-127(1991). 

227. (GTP_EFTU1) 
ATP/GTP-binding site motif A (P-loop) 

From sequence comparisons and crystallographic data analysis it has been shown
[1,2,3,4,5,6] that an appreciable proportion of proteins that bind ATP or GTP share a number
of more or less conserved sequence motifs. The best conserved of these motifs is a
glycine-rich region, which typically forms a flexible loop between a beta-strand and an
alpha-helix. This loop interacts with one of the phosphate groups of the nucleotide. This
sequence motif is generally referred to as the 'A' consensus sequence [1] or the 'P-loop' [5].

There are numerous ATP- or GTP-binding proteins in which the P-loop is found. Listed below
are a number of protein families for which the relevance of the presence of such motif has
been noted:

- ATP synthase alpha and beta subunits (see <PDOC00137>).
- Myosin heavy chains.
- Kinesin heavy chains and kinesin-like proteins (see <PDOC00343>).
- Dynamins and dynamin-like proteins (see <PDOC00362>).
- Guanylate kinase (see <PDOC00670>).
- Thymidine kinase (see <PDOC00524>).
- Thymidylate kinase (see <PDOC01034>).
- Shikimate kinase (see <PDOC00868>).
- Nitrogenase iron protein family (nifH/frxC) (see <PDOC00580>).
- ATP-binding proteins involved in 'active transport' (ABC transporters) [7] (see
<PDOC00185>).
- DNA and RNA helicases [8,9,10].
- GTP-binding elongation factors (EF-Tu, EF-1alpha, EF-G, EF-2, etc.).
- Ras family of GTP-binding proteins (Ras, Rho, Rab, Ral, Ypt1, SEC4, etc.).
- Nuclear protein ran (see <PDOC00859>).
- ADP-ribosylation factors family (see <PDOC00781>).
- Bacterial dnaA protein (see <PDOC00771>).
- Bacterial recA protein (see <PDOC00131>).
- Bacterial recF protein (see <PDOC00539>).
- Guanine nucleotide-binding proteins alpha subunits (Gi, Gs, Gt, G0, etc.).
- DNA mismatch repair proteins mutS family (see <PDOC00388>).
- Bacterial type II secretion system protein E (see <PDOC00567>).

Not all ATP- or GTP-binding proteins are picked up by this motif. A number of proteins escape
detection because the structure of their ATP-binding site is completely different from that of
the P-loop. Examples of such proteins are the E1-E2 ATPases or the glycolytic kinases. In
other ATP- or GTP-binding proteins the flexible loop exists in a slightly different form; this
is the case for tubulins or protein kinases. A special mention must be reserved for adenylate
kinase, in which there is a single deviation from the P-loop pattern: in the last position Gly is
found instead of Ser or Thr.

Consensus pattern: [AG]-x(4)-G-K-[ST]
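The consensus patterns in this section use PROSITE notation: elements are separated by hyphens, square brackets list the residues allowed at a position, x stands for any residue, and a number in parentheses gives a repeat count or range. As a minimal sketch of how such a pattern could be applied to a sequence, the following Python code translates the notation into a regular expression. The function name prosite_to_regex and the test fragment are illustrative assumptions, not part of the specification, and the sketch does not handle terminus anchors.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().strip("-").split("-"):
        element = element.strip()
        # Separate an optional repeat suffix such as "(4)" or "(0,1)" from the residue part.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        residue, low, high = m.groups()
        if residue == "x":
            residue = "."                          # x stands for any amino acid
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"   # {..} lists excluded residues
        # Bracketed sets such as [ST] are already valid regex character classes.
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(residue + repeat)
    return "".join(parts)

p_loop = prosite_to_regex("[AG]-x(4)-G-K-[ST]")
fragment = "MTEYKLVVVGAGGVGKSALTIQ"  # illustrative fragment, not taken from the specification
hit = re.search(p_loop, fragment)
print(p_loop, hit.group(), hit.start() + 1)  # [AG].{4}GK[ST] GAGGVGKS 10
```

The reported position is 1-based, following the usual convention for protein sequences. A hit only indicates that the sequence is compatible with the motif; as noted above, some nucleotide-binding proteins lack it.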

[ 1] Walker J.E., Saraste M., Runswick M.J., Gay N.J. EMBO J. 1:945-951(1982). 
[ 2] Moller W., Amons R. FEBS Lett. 186:1-7(1985).

[ 3] Fry D.C., Kuby S.A., Mildvan A.S. Proc. Natl. Acad. Sci. U.S.A. 83:907-911(1986). 
[ 4] Dever T.E., Glynias M.J., Merrick W.C. Proc. Natl. Acad. Sci. U.S.A. 84:1814- 
1818(1987). 

[ 5] Saraste M., Sibbald P.R., Wittinghofer A. Trends Biochem. Sci. 15:430-434(1990). 
[ 6] Koonin E.V. J. Mol. Biol. 229:1165-1174(1993). 

[ 7] Higgins C.F., Hyde S.C., Mimmack M.M., Gileadi U., Gill D.R., Gallagher M.P. J. 
Bioenerg. Biomembr. 22:571-592(1990). 

[ 8] Hodgman T.C. Nature 333:22-23(1988) and Nature 333:578-578(1988) (Errata). 

[ 9] Linder P., Lasko P., Ashburner M., Leroy P., Nielsen P.J., Nishi K., Schnier J., Slonimski
P.P. Nature 337:121-122(1989).

[10] Gorbalenya A.E., Koonin E.V., Donchenko A.P., Blinov V.M. Nucleic Acids Res. 
17:4713-4730(1989). 

GTP-binding elongation factors signature (GTP_EFTU2) 

Elongation factors [1,2] are proteins catalyzing the elongation of peptide chains in protein
biosynthesis. In both prokaryotes and eukaryotes, there are three distinct types of elongation
factors, as described in the following table:

Eukaryotes   Prokaryotes   Function
EF-1alpha    EF-Tu         Binds GTP and an aminoacyl-tRNA; delivers the latter to
                           the A site of ribosomes.
EF-1beta     EF-Ts         Interacts with EF-1a/EF-Tu to displace GDP and thus
                           allows the regeneration of GTP-EF-1a.
EF-2         EF-G          Binds GTP and peptidyl-tRNA and translocates the latter
                           from the A site to the P site.

The GTP-binding elongation factor family also includes the following proteins:

- Eukaryotic peptide chain release factor GTP-binding subunits [3]. These proteins
interact with release factors that bind to ribosomes that have encountered a stop codon at
their decoding site and help them to induce release of the nascent polypeptide. The yeast
protein was known as SUP2 (and also as SUP35, SUF12 or GST1) and the human homolog as
GST1-Hs.
- Prokaryotic peptide chain release factor 3 (RF-3) (gene prfC). RF-3 is a class-II
RF, a GTP-binding protein that interacts with class-I RFs (see <PDOC00607>) and enhances
their activity [4].
- Prokaryotic GTP-binding protein lepA and its homolog in yeast (gene GUF1) and in
Caenorhabditis elegans (ZK1236.1).
- Yeast HBS1 [5].
- Rat statin S1 [6], a protein of unknown function which is highly similar to EF-1alpha.
- Prokaryotic selenocysteine-specific elongation factor selB [7], which seems to replace
EF-Tu for the insertion of selenocysteine directed by the UGA codon.
- The tetracycline resistance proteins tetM/tetO [8,9] from various bacteria such as
Campylobacter jejuni, Enterococcus faecalis, Streptococcus mutans and Ureaplasma
urealyticum. Tetracycline binds to the prokaryotic ribosomal 30S subunit and inhibits
binding of aminoacyl-tRNAs. These proteins abolish the inhibitory effect of tetracycline on
protein synthesis.
- Rhizobium nodulation protein nodQ [10].
- Escherichia coli hypothetical protein yihK [11].

In EF-1alpha, a specific region has been shown [12] to be involved in a conformational
change mediated by the hydrolysis of GTP to GDP. This region is conserved in both
EF-1alpha/EF-Tu and EF-2/EF-G and thus seems typical for GTP-dependent proteins which
bind non-initiator tRNAs to the ribosome. The pattern developed for this family of proteins
includes that conserved region.

Consensus pattern: D-[KRSTGANQFYW]-x(3)-E-[KRAQ]-x-[RKQD]-[GC]-[IVMK]-[ST]-
[IV]-x(2)-[GSTACKRNQ]

[ 1] Concise Encyclopedia Biochemistry, Second Edition, Walter de Gruyter, Berlin New York
(1988).

[ 2] Moldave K. Annu. Rev. Biochem. 54:1109-1149(1985). 
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[ 3] Stansfield I., Jones K.M., Kushnirov V.V., Dagkesamanskaya A.R., Poznyakovski A.I., 
Paushkin S.V., Nierras C.R., Cox B.S., Ter-Avanesyan M.D., Tuite M.F. EMBO J. 14:4365- 
4373(1995). 

[ 4] Grentzmann G., Brechemier-Baey D., Heurgue-Hamard V., Buckingham R.H. J. Biol.
Chem. 270:10595-10600(1995).

[ 5] Nelson R.J., Ziegelhoffer T., Nicolet C., Werner-Washburne M., Craig E.A. Cell 71:97-
105(1992).

[ 6] Ann D.K., Moutsatsos I.K., Nakamura T., Lin H.H., Mao P.-L., Lee M.-J., Chin S., Liem
R.K.H., Wang E. J. Biol. Chem. 266:10429-10437(1991).

[ 7] Forchhammer K., Leinfelder W., Bock A. Nature 342:453-456(1989).

[ 8] Manavathu E.K., Hiratsuka K., Taylor D.E. Gene 62:17-26(1988). 

[ 9] Leblanc D.J., Lee L.N., Titmas B.M., Smith C.J., Tenover F.C. J. Bacteriol. 170:3618-
3626(1988).

[10] Cervantes E., Sharma S.B., Maillet P., Vasse J., Truchet G., Rosenberg C. Mol. 
Microbiol. 3:745-755(1989). 

[11] Plunkett G. III, Burland V.D., Daniels D.L., Blattner F.R. Nucleic Acids Res. 21:3391-
3398(1993).

[12] Moller W., Schipper A., Amons R. Biochimie 69:983-989(1987). 

228. GTP cyclohydrolase II. 

GTP cyclohydrolase II catalyses the first committed step in the biosynthesis of riboflavin. 

[1] Richter G, Ritz H, Katzenmeier G, Volk R, Kohnle A, Lottspeich F, Allendorf D, Bacher
A, J Bacteriol 1993;175:4045-4051.

229. Galactose-1-phosphate uridyl transferase signatures (GalP_UDP_transf)

Galactose-1-phosphate uridyl transferase (EC 2.7.7.10) (galT) catalyzes the transfer of a
uridyldiphosphate group on galactose (or glucose) 1-phosphate. During the reaction, the
uridyl moiety links to a histidine residue. In the Escherichia coli enzyme, it has been shown
[1] that two histidine residues separated by a single proline residue are essential for enzyme
activity. On the basis of sequence similarities, two apparently unrelated families seem to
exist. Class-I enzymes are found in eukaryotes as well as some bacteria such as Escherichia
coli or Streptomyces lividans, while class-II enzymes have been found so far only in bacteria
such as Bacillus subtilis or Lactobacillus helveticus [2]. Signature patterns for both families
were developed. For class-I enzymes the signature is based on the active site residues. For
class-II enzymes a region which also includes two conserved histidines was chosen.

Consensus pattern: F-E-N-[RK]-G-x(3)-G-x(4)-H-P-H-x-Q [The two H's are the active site
residues]

Consensus pattern: D-L-P-I-V-G-G-[ST]-[LIVM](2)-[SA]-H-[DEN]-H-[FY]-Q-G-G

Note: class-I enzymes are structurally related to the HIT family of proteins (see
<PDOC00694>).

[ 1] Reichardt J.K.V., Berg P. Nucleic Acids Res. 16:9017-9026(1988). 
[ 2] Mollet B., Pilloud N. J. Bacteriol. 173:4464-4473(1991). 
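
Because the class-I and class-II galactose-1-phosphate uridyl transferase families each have their own signature, the two patterns given for this entry can be used together to assign a sequence to a class. The following is a minimal sketch under stated assumptions: the regular expressions are hand translations of the two consensus patterns, the function name classify_galt is hypothetical, and the two toy strings are synthetic sequences built only to satisfy the patterns.

```python
import re

# Hand-translated regular expressions for the class-I and class-II signatures.
CLASS_I = re.compile(r"FEN[RK]G.{3}G.{4}HPH.Q")
CLASS_II = re.compile(r"DLPIVGG[ST][LIVM]{2}[SA]H[DEN]H[FY]QGG")

def classify_galt(sequence: str) -> str:
    """Return 'class-I', 'class-II', 'both' or 'none' for a protein sequence."""
    is_i = CLASS_I.search(sequence) is not None
    is_ii = CLASS_II.search(sequence) is not None
    if is_i and is_ii:
        return "both"
    if is_i:
        return "class-I"
    if is_ii:
        return "class-II"
    return "none"

# Synthetic strings that satisfy one pattern each; they are not real enzymes.
toy_class_i = "MA" + "FENRGAAAGAAAAHPHAQ" + "LK"
toy_class_ii = "MA" + "DLPIVGGSLLSHDHFQGG" + "LK"

print(classify_galt(toy_class_i))   # class-I
print(classify_galt(toy_class_ii))  # class-II
```

A result of 'none' only means that neither signature was found; it does not by itself show that a sequence is unrelated to these enzymes.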


230. Gamma-thionins family signature 

The following small plant proteins are evolutionarily related:

- Gamma-thionins from wheat endosperm (gamma-purothionins) and barley
(gamma-hordothionins), which are toxic to animal cells and inhibit protein
synthesis in cell-free systems [1].
- A flower-specific thionin (FST) from tobacco [2].
- Antifungal proteins (AFP) from the seeds of Brassicaceae species such as radish,
mustard, turnip and Arabidopsis thaliana [3].
- Inhibitors of insect alpha-amylases from sorghum [4].
- Probable protease inhibitor P322 from potato.
- A germination-related protein from cowpea [5].
- Anther-specific protein SF18 from sunflower [6]. SF18 is a protein that contains a
gamma-thionin domain at its N-terminus and a proline-rich C-terminal domain.
- Soybean sulfur-rich protein SE60 [7].
- Vicia faba antibacterial peptides fabatin-1 and -2.

In their mature form, these proteins generally consist of about 45 to 50 amino-acid
residues. As shown in the following schematic representation, these peptides contain eight
conserved cysteines involved in disulfide bonds.
xxCxxxxxxxxxxCxxxxxCxxxCxxxxxxxxxCxxxxxxCxCxxxC
************************

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

Consensus pattern: [KRG]-x-C-x(3)-[SV]-x(2)-[FYWH]-x-[GF]-x-C-x(5)-C-x(3)-C [The
four C's are involved in disulfide bonds]
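
The gamma-thionin signature covers four of the conserved cysteines, so a match can also report where those residues sit in a candidate peptide. The following minimal sketch assumes a hand translation of the consensus pattern into a regular expression with one capture group per cysteine; the function name signature_cysteines is hypothetical and the toy peptide is synthetic.

```python
import re

# Each conserved cysteine of the signature is wrapped in its own capture group.
THIONIN = re.compile(r"[KRG].(C).{3}[SV].{2}[FYWH].[GF].(C).{5}(C).{3}(C)")

def signature_cysteines(sequence: str):
    """Return 1-based positions of the four signature cysteines, or None if no match."""
    match = THIONIN.search(sequence)
    if match is None:
        return None
    return [match.start(i) + 1 for i in range(1, 5)]

# Synthetic peptide assembled piece by piece to satisfy the pattern; not a real thionin.
toy_peptide = "KAC" + "AAASAAFAGA" + "C" + "AAAAA" + "C" + "AAA" + "C"

print(signature_cysteines(toy_peptide))  # [3, 14, 20, 24]
```

The spacing between the reported positions (10, 5 and 3 intervening residues) follows directly from the pattern.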

[1] Bruix M., Jimenez M.A., Santoro J., Gonzalez C, Colilla F.J., Mendez E., Rico M. 
Biochemistry 32:715-724(1993). 

[2] Gu Q., Kawata E.E., Morse M.-J., Wu H.-M., Cheung A.Y. Mol. Gen. Genet. 234:89- 
96(1992). 

[3] Terras F.R.G., Torrekens S., van Leuven P., Osborn R.W., Vanderleyden J., Cammue
B.P.A., Broekaert W.F. FEBS Lett. 316:233-240(1993).

[4] Bloch C. Jr., Richardson M. FEBS Lett. 279:101-104(1991). 

[5] Ishibashi N., Yamauchi D., Minamikawa T. Plant Mol. Biol. 15:59-64(1990).

[7] Choi Y., Choi Y.D., Lee J.S. Plant Physiol. 101:699-700(1993). 

231. Gelsolin. Gelsolin repeat. Number of members: 170 

[1] Medline: 97433077. The crystal structure of plasma gelsolin: implications for actin
severing, capping, and nucleation. Burtnick LD, Koepf EK, Grimes J, Jones EY, Stuart DI, 
McLaughlin PJ, Robinson RC; Cell 1997;90:661-670. 

232. Germin family signature 

Germins [1] are a family of homopentameric cereal glycoproteins expressed during
germination which may play a role in altering the properties of cell walls during germinative
growth. It has been shown that wheat and barley germins act as oxalate oxidases (EC 1.2.3.4),
an enzyme that catalyzes the oxidative degradation of oxalate to carbonate and hydrogen
peroxide. Germins are highly similar to:

- Germin-like proteins from various plants such as rape, violet or white mustard.
- Slime mold spherulins 1a and 1b, which are proteins that accumulate specifically during
spherulation, a process induced by various forms of environmental stress which leads to
encystment and dormancy.

As a signature pattern the best conserved region was selected: a decapeptide located in the
central section of these proteins.

Consensus pattern: G-x(4)-H-x-H-P-x-A-x-E-[LIVM]

[ 1] Lane B.G. FASEB J. 8:294-301(1994). 

233. (GlutR) 
Glutamyl-tRNA reductase signature

Delta-aminolevulinic acid (ALA) is the obligatory precursor for the synthesis of all
tetrapyrroles including porphyrin derivatives such as chlorophyll and heme. ALA can be
synthesized via two different pathways: the Shemin (or C4) pathway, which involves the
single-step condensation of succinyl-CoA and glycine and which is catalyzed by ALA
synthase (EC 2.3.1.37), and the C5 pathway from the five-carbon skeleton of glutamate.
The C5 pathway operates in the chloroplast of plants and algae, in cyanobacteria, in some
eubacteria and in archaebacteria.

The initial step in the C5 pathway is carried out by glutamyl-tRNA reductase (GluTR) [1],
which catalyzes the NADP-dependent conversion of glutamate-tRNA(Glu) to
glutamate-1-semialdehyde (GSA) with the concomitant release of tRNA(Glu), which can
then be recharged with glutamate by glutamyl-tRNA synthetase.

GluTR is a protein of about 50 Kd (467 to 550 residues) which contains a few conserved
regions. The best conserved region is located in positions 99 to 122 in the sequence of known
GluTR. This region seems important for the activity of the enzyme. We have developed a
signature pattern from that conserved region.

Consensus pattern: H-[LIVM]-x(2)-[LIVM]-[GSTAC](3)-[LIVM]-[DEQ]-S-[LIVMA]-
[LIVM](2)-[GF]-E-x-[EQR]-[IV]-[LIT]-[STAG]-Q-[LIVM]-[KR]

Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Jahn D., Verkamp E., Soell D. Trends Biochem. Sci. 17:215-218(1992). 

234. (Glycoprotease) 
Glycoprotease family signature (aka Peptidase_M22)

Glycoprotease (GCP) (EC 3.4.24.57) [1], or O-sialoglycoprotein endopeptidase, is a
metalloprotease secreted by Pasteurella haemolytica which specifically cleaves
O-sialoglycoproteins such as glycophorin A. The sequence of GCP is highly similar to the
following uncharacterized proteins:

- Escherichia coli hypothetical protein ygjD (ORF-X).
- Bacillus subtilis hypothetical protein ydiE.
- Mycobacterium leprae hypothetical protein U229E.
- Mycobacterium tuberculosis hypothetical protein MtCY78.10.
- Synechocystis strain PCC 6803 hypothetical protein slr0807.
- Methanococcus jannaschii hypothetical protein MJ1130.
- Haloarcula marismortui hypothetical protein in HSH 3' region.
- Yeast hypothetical protein YKR038c.
- Yeast hypothetical protein QRI7.

One of the conserved regions contains two conserved histidines. It is possible that this region 
is involved in coordinating a metal ion such as zinc. 

Consensus pattern: [KR]-[GSAT]-x(4)-[FYWLH]-[DQNGK]-x-P-x-[LIVMFY]-x(3)-H-x(2)-
[AG]-H-[LIVM]

Sequences known to belong to this class detected by the pattern: ALL.

Note: these proteins belong to family M22 in the classification of peptidases [2,E1].

[ 1] Abdullah K.M., Lo R.Y.C., Mellors A. J. Bacteriol. 173:5597-5603(1991).

[ 2] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:183-228(1995).

235. (Glucosamine_iso)

Glucosamine/galactosamine-6-phosphate isomerases signature 

Glucosamine-6-phosphate isomerase (EC 5.3.1.10) (or Glc-6-P deaminase) is the enzyme
responsible for the conversion of glucosamine 6-phosphate into fructose 6-phosphate [1]. It is
the last specific step in the pathway for N-acetylglucosamine (GlcNAC) utilization in bacteria
such as Escherichia coli (gene nagB) or in fungi such as Candida albicans (gene NAG1).

Glc-6-P isomerase is evolutionarily related to:

- A putative Escherichia coli galactosamine-6-phosphate isomerase (gene agaI) [2].
- Escherichia coli hypothetical protein yieK.
- Bacillus subtilis hypothetical protein ybfT.

As a signature pattern a conserved region located in the central part of these enzymes was
selected. This region contains a conserved histidine which has been shown [1], in nagB, to be
important for the pyranose ring-opening step of the catalytic mechanism.

Consensus pattern: [LIVM]-x(3)-G-x-[LIT]-x-[LIV]-x-[LIVM]-x-G-[LIVM]-G-x-[DEN]-G-H

[ 1] Oliva G., Pontes M.R.M., Garratt R.C., Altamirano M.M., Calcagno M.L., Horjales E. 
Structure 3:1323-1332(1995). 

[ 2] Reizer J., Ramseier T.M., Reizer A., Charbit A., Saier M.H. Jr. Microbiology 142:231- 
250(1996). 

236. Pneumovirus attachment glycoprotein G (glycoprotein G)

This family includes attachment proteins from respiratory syncytial virus. Glycoprotein
G has not been shown to have any neuraminidase or hemagglutinin activity (Swiss-Prot). The
amino terminus is thought to be cytoplasmic, and the carboxyl terminus extracellular. The
extracellular region contains four completely conserved cysteine residues.

[1] Johnson PR, Spriggs MK, Olmsted RA, Collins PL, Proc Natl Acad Sci U S A 
1987;84:5625-5629. 


237. Glycosyl transferases group 1 

Mutations in this domain of Swiss:P37287 lead to disease (Paroxysmal Nocturnal
haemoglobinuria). Members of this family transfer activated sugars to a variety of substrates,
including glycogen, fructose-6-phosphate and lipopolysaccharides. Members of this family
transfer UDP-, ADP-, GDP- or CMP-linked sugars. The eukaryotic glycogen synthases may be
distant members of this family.

238. Glycosyl transferases (Glycos_transf_2)

Diverse family, transferring sugar from UDP-glucose, UDP-N-acetyl-galactosamine, 
GDP-mannose or CDP-abequose, to a range of substrates including cellulose, dolichol 
phosphate and teichoic acids. 


239. (Glycos_transf_3)

Thymidine and pyrimidine-nucleoside phosphorylases signature

Thymidine phosphorylase (EC 2.4.2.4) catalyzes the reversible phosphorolysis of
thymidine, deoxyuridine and their analogues to their respective bases and 2-deoxyribose
1-phosphate. This enzyme regulates the availability of thymidine and is therefore essential to
nucleic acid metabolism.

In Escherichia coli (gene deoA), the enzyme is a dimer of identical subunits of about 48
Kd [1]. In humans it was first identified as platelet-derived endothelial cell growth factor
(PD-ECGF) [E1] before being recognized [2] as thymidine phosphorylase.

Bacterial pyrimidine-nucleoside phosphorylase (EC 2.4.2.2) (gene pdp) [3] is an enzyme
evolutionarily and structurally related to thymidine phosphorylase.

A well conserved region of 19 residues located in the N-terminal part of these proteins
was selected as a signature pattern for these enzymes.

Consensus pattern: S-[GS]-R-[GA]-[LIV]-x(2)-[TA]-[GA]-G-T-x-D-x-[LIV]-E

Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Walter M.R., Cook W.J., Cole L.B., Short S.A., Koszalka G.W., Krenitsky T.A., Ealick
S.E. J. Biol. Chem. 265:14016-14022(1990).

[ 2] Furukawa T., Yoshimura A., Sumizawa T., Haraguchi M., Akiyama S.-I., Fukui K.,
Yamada Y. Nature 356:668-668(1992).

[ 3] Saxild H.H., Andersen L.N., Hammer K. J. Bacteriol. 178:424-434(1996). 

240. Glycos_transf_4. Glycosyl transferase. Number of members: 44. 

[1] Medline: 95252686. A family of UDP-GlcNAc/MurNAc: polyisoprenol-P
GlcNAc/MurNAc-1-P transferases. Lehrman MA; Glycobiology 1994;4:768-771.

241. Glycosyl hydrolases family 15. 21 members. 

242. Glycosyl hydrolases family 16 signature 

It has been shown [1] that the following glycosyl hydrolases can be classified into a single
family on the basis of sequence similarities:

- Bacterial beta-1,3-1,4-glucanases, or lichenases, (EC 3.2.1.73) mainly from Bacillus but
also from Clostridium thermocellum (gene licB), Fibrobacter succinogenes and
Rhodothermus marinus (gene bglA).
- Bacillus circulans beta-1,3-glucanase A1 (EC 3.2.1.39) (gene glcA).
- Laminarinase (EC 3.2.1.6) from Clostridium thermocellum (gene lam1).
- Streptomyces coelicolor agarase (EC 3.2.1.81) (gene dagA).
- Alteromonas carrageenovora kappa-carrageenase (EC 3.2.1.83) (gene cgkA).

Two closely clustered conserved glutamates have been shown [2] to be involved in the
catalytic activity of Bacillus licheniformis lichenase. The region that contains these
residues was used as a signature pattern.

Consensus pattern: E-[LIV]-D-[LIV]-x(0,1)-E-x(2)-[GQ]-[KRNF]-x-[PSTA] [The two E's
are active site residues]
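
The family 16 signature contains an optional position, written x(0,1) in PROSITE notation, so matching segments can be 11 or 12 residues long. The following minimal sketch shows how that range behaves when the pattern is hand translated into a regular expression. The function name signature_span is hypothetical and both test strings are synthetic.

```python
import re

# x(0,1) becomes ".{0,1}": the position may be absent or hold any single residue.
GH16 = re.compile(r"E[LIV]D[LIV].{0,1}E.{2}[GQ][KRNF].[PSTA]")

def signature_span(sequence: str):
    """Return (1-based start, matched text) for the first hit, or None."""
    match = GH16.search(sequence)
    if match is None:
        return None
    return match.start() + 1, match.group()

# Synthetic strings built only to satisfy the pattern; they are not real enzymes.
without_insert = "AA" + "ELDIEAAGKAP" + "AA"    # optional position absent
with_insert = "AA" + "ELDIMEAAGKAP" + "AA"      # optional position filled by M

print(signature_span(without_insert))  # (3, 'ELDIEAAGKAP')
print(signature_span(with_insert))     # (3, 'ELDIMEAAGKAP')
```

Both strings are accepted, and the position of the second glutamate within the match shifts by one residue when the optional position is occupied.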

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Juncosa M., Pons J., Dot T., Querol E., Planas A. J. Biol. Chem. 269:14530- 
14535(1994). 

243. Glycosyl hydrolases family 17 signature 

It has been shown [1,2] that the following glycosyl hydrolases can be classified into a single
family on the basis of sequence similarities:

- Glucan endo-1,3-beta-glucosidases (EC 3.2.1.39) (endo-(1->3)-beta-glucanase) from
various plants. This enzyme may be involved in the defense of plants against pathogens
through its ability to degrade fungal cell wall polysaccharides.
- Glucan 1,3-beta-glucosidase (EC 3.2.1.58) (exo-(1->3)-beta-glucanase) from yeast (gene
BGL2). This enzyme may play a role in cell expansion during growth, in cell-cell fusion
during mating, and in spore release during sporulation.
- Lichenases (EC 3.2.1.73) (endo-(1->3,1->4)-beta-glucanase) from various plants.

The best conserved region in the sequence of these enzymes is located in their central section.
This region contains a conserved tryptophan residue which could be involved in the
interaction with the glucan substrates [2] and it also contains a conserved glutamate which
has been shown [3] to act as the nucleophile in the catalytic mechanism. This region was used
as a signature pattern.

Consensus pattern: [LIVM]-x-[LIVMFYWA](3)-[STAG]-E-[STA]-G-W-P-[STN]-x-
[SAGQ] [E is an active site residue]

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Ori N., Sessa G., Lotan T., Himmelhoch S., Fluhr R. EMBO J. 9:3429-3436(1990). 
[ 3] Varghese J.N., Garrett T.P.J., Colman P.M., Chen L., Hoj P.J., Fincher G.B. Proc. Natl.
Acad. Sci. U.S.A. 91:2785-2789(1994).

244. Glyoxalase I signatures

Glyoxalase I (EC 4.4.1.5) (lactoylglutathione lyase) catalyzes the first step of the glyoxal
pathway, the transformation of methylglyoxal and glutathione into S-lactoylglutathione, which
is then converted by glyoxalase II to lactic acid [1]. Glyoxalase I is a ubiquitous enzyme
which binds one mole of zinc per subunit. The bacterial and yeast enzymes are monomeric
while the mammalian one is homodimeric. The sequence of glyoxalase I is well conserved. In
bacteria and mammals, the enzyme is a protein of about 130 to 180 residues while in fungi it
is about twice as long. In these organisms the enzyme is built out of the tandem repeat of a
homologous domain. Two signature patterns for this family were derived. The first one is
located in the N-terminal region while the second one is located in the central section of the
protein and contains a conserved histidine that could be implicated in the binding of the zinc
atom.

Consensus pattern: [HQ]-[IVT]-x-[LIVFY]-x-[IV]-x(5)-[STA]-x(2)-F-[YM]-x(2,3)-[LMF]-
G-[LMF]

Consensus pattern: G-[NTKQ]-x(0,5)-[GA]-[LVFY]-[GH]-H-[IVF]-[CGA]-x-[STAGLE]-
x(2)-[DNC]
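
Since the fungal enzyme is described as a tandem repeat of a homologous domain, one simple check is to count how many times each glyoxalase I signature occurs in a sequence. The following is a minimal sketch under stated assumptions: the regular expressions are hand translations of the two consensus patterns, the function name count_signatures is hypothetical, and the toy domain is a synthetic string built only to satisfy both patterns.

```python
import re

# Hand-translated forms of the two glyoxalase I signatures.
SIGNATURE_1 = re.compile(r"[HQ][IVT].[LIVFY].[IV].{5}[STA].{2}F[YM].{2,3}[LMF]G[LMF]")
SIGNATURE_2 = re.compile(r"G[NTKQ].{0,5}[GA][LVFY][GH]H[IVF][CGA].[STAGLE].{2}[DNC]")

def count_signatures(sequence: str):
    """Count non-overlapping matches of each signature in a sequence."""
    first = sum(1 for _ in SIGNATURE_1.finditer(sequence))
    second = sum(1 for _ in SIGNATURE_2.finditer(sequence))
    return first, second

# Synthetic domain that satisfies both patterns; it is not a real glyoxalase.
toy_domain = "HIALAIAAAAASAAFYAALGL" + "PPPP" + "GNGLGHICASAAD"
single = toy_domain                # one domain, as described for bacteria and mammals
tandem = toy_domain + toy_domain   # two domains in tandem, as described for fungi

print(count_signatures(single))  # (1, 1)
print(count_signatures(tandem))  # (2, 2)
```

In real sequences a repeated domain may have diverged enough that only one copy still matches, so a count of one does not rule out a tandem arrangement.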

[ 1] Kim N.-S., Umezawa Y., Ohmura S., Kato S. J. Biol. Chem. 268:11217-11221(1993). 

245. (Glypican) 
Glypicans signature 

Glypicans [1,2] are a family of heparan sulfate proteoglycans which are anchored to cell
membranes by a glycosylphosphatidylinositol (GPI) linkage. Structurally, these proteins
consist of three separate domains:

a) A signal sequence;
b) An extracellular domain of about 500 residues that contains 12 conserved
cysteines probably involved in disulfide bonds and which also contains the
sites of attachment of the heparan sulfate glycosaminoglycan side chains;
c) A C-terminal hydrophobic region which is post-translationally removed
after formation of the GPI-anchor.

The proteins known to belong to this family are:

- Glypican 1 (GPC1).
- Glypican 2 (GPC2) or cerebroglycan.
- Glypican 3 (GPC3) or OCI-5. In man, defects in GPC3 are the cause of an X-linked
genetic disease, Simpson-Golabi-Behmel syndrome (SGBS).
- K-glypican.
- Glypican 5 (GPC5).
- Drosophila protein dally.

The signature pattern that was developed for glypicans is located in the central section of
the extracellular domain and contains five of the conserved cysteines.

Consensus pattern: C-x(2)-C-x-G-[LIVM]-x(4)-P-C-x(2)-[FY]-C-x(2)-[LIVM]-x(2)-G-C [The
C's are probably involved in disulfide bonds]

Sequences known to belong to this class detected by the pattern: ALL, except for dally.

[ 1] Weksberg R., Squire J.A., Templeton D.M. Nat. Genet. 12:225-227(1996). 
[ 2] Watanabe K., Yamada H., Yamaguchi Y. J. Cell Biol. 130:1207-1218(1995). 

246. Granins signatures 

Granins (chromogranins or secretogranins) [1] are a family of acidic proteins present in the
secretory granules of a wide variety of endocrine and neuro-endocrine cells. The exact
function(s) of these proteins is not yet known but they seem to be the precursors of
biologically active peptides and/or they may act as helper proteins in the packaging of peptide
hormones and neuropeptides. Three members of this family of proteins show some sequence
similarities:

- Chromogranin A (CGA) [2]. CGA is a protein of about 420 residues; it is the precursor of
the peptide pancreastatin, which strongly inhibits glucose-induced insulin release from the
pancreas.
- Secretogranin 1 (chromogranin B). A sulfated protein of about 600 residues.
- Secretogranin 2 (chromogranin C). A sulfated protein of about 650 residues.

Apart from their subcellular location and the abundance of acidic residues (Asp and Glu),
these proteins do not share many structural similarities. Only one short region, located in the
C-terminal section, is conserved in all these proteins. Chromogranins A and B share a region
of high similarity in their N-terminal section; this region includes two cysteine residues
involved in a disulfide bond.

Consensus pattern: [DE]-[SN]-L-[SAN]-x(2)-[DE]-x-E-L

Consensus pattern: C-[LIVM](2)-E-[LIVM](2)-S-[DN]-[STA]-L-x-K-x-S-x(3)-[LIVM]-
[STA]-x-E-C [The two C's are linked by a disulfide bond]

[ 1] Huttner W.B., Gerdes H.-H., Rosa P. Trends Biochem. Sci. 16:27-30(1991). 
[ 2] Simon J.-P., Aunis D. Biochem. J. 262:1-13(1989). 

247. grpE protein signature 

In prokaryotes the grpE protein [1] stimulates, jointly with dnaJ, the ATPase activity of the
dnaK chaperone. It seems to accelerate the release of ADP from dnaK, thus allowing dnaK to
recycle more efficiently. GrpE is a protein of about 22 to 25 Kd. In yeast, an evolutionarily
related mitochondrial protein (gene GRPE) has been shown [2] to associate with the
mitochondrial hsp70 protein and thus to play a role in the import of proteins from the
cytoplasm. As a signature pattern, the most conserved region of grpE was selected. It is
located in the C-terminal section.

Consensus pattern: [FL]-[DN]-[PHEA]-x(2)-[HM]-x-A-[LIVMTN]-x(16,20)-G-[FY]-x(3)-
[DEG]-x(2)-[LIVM]-[RI]-x-[SA]-x-V-x-[IV]

[ 1] Georgopoulos C., Welch W. Annu. Rev. Cell Biol. 9:601-635(1993).

[ 2] Bolliger L., Deloche O., Glick B.S., Georgopoulos C., Jenoe P., Kronidou N., Horst M.,
Morishima N., Schatz G. EMBO J. 13:1998-2006(1994).

248. Guanylate kinase signature and profile

Guanylate kinase (EC 2.7.4.8) (GK) [1] catalyzes the ATP-dependent phosphorylation of
GMP into GDP. It is essential for recycling GMP and, indirectly, cGMP. In prokaryotes (such
as Escherichia coli), lower eukaryotes (such as yeast) and in vertebrates, GK is a highly
conserved monomeric protein of about 200 amino acids. GK has been shown [2,3,4] to be
structurally similar to the following proteins:

- Protein A57R (or SalG2R) from various strains of Vaccinia virus. This protein is highly
similar to GK, but contains a frameshift mutation in the N-terminal section and could
therefore be inactive in that virus.

The following proteins are characterized by the presence in their sequence of one or more
copies of the DHR domain, an SH3 domain (see <PDOC50002>) as well as a C-terminal GK-like
domain. These proteins are collectively termed MAGUKs (membrane-associated guanylate
kinase homologs) [5]:

- Drosophila lethal(1)discs large-1 tumor suppressor protein (gene dlg1). This protein is
associated with septate junctions in developing flies and defects in the dlg1 gene cause
neoplastic overgrowth of the imaginal disks.
- Mammalian tight junction protein ZO-1.
- A family of mammalian synaptic proteins that seem to interact with the cytoplasmic tail
of NMDA receptor subunits. This family currently consists of SAP90/PSD-95,
CHAPSYN-110/PSD-93, SAP97/DLG1 and SAP102.
- Vertebrate 55 Kd erythrocyte membrane protein (p55). p55 is a palmitoylated,
membrane-associated protein of unknown function.
- Caenorhabditis elegans protein lin-2, which may play a structural role in the induction
of the vulva.
- Rat protein CASK.
- Human protein DLG2.
- Human protein DLG3.

There is an ATP-binding site (P-loop) in the N-terminal section of GK. This region is
not conserved in the GK-like domain of the above proteins, which are therefore unlikely to be
kinases. However, these proteins retain the residues known, in GK, to be involved in the
binding of GMP. As a signature pattern a highly conserved region was selected that contains
two arginines and a tyrosine which are involved in GMP-binding.

Consensus pattern: T-[ST]-R-x(2)-[KR]-x(2)-[DE]-x(2)-G-x(2)-Y-x-[FY]-[LIVMK]

[ 1] Stehle T., Schulz G.E. J. Mol. Biol. 224:1127-1141(1992). 
[ 2] Bryant P.J., Woods D.F. Cell 68:621-622(1992).
[ 3] Goebl M.G. Trends Biochem. Sci. 17:99-99(1992). 

[ 4] Zschocke P.D., Schiltz E., Schulz G.E. Eur. J. Biochem. 213:263-269(1993). 
[ 5] Woods D.F., Bryant P.J. Mech. Dev. 44:85-89(1994).

249. (Glyco_hydro_35)

Glycosyl hydrolases family 35 putative active site 

Beta-galactosidases (EC 3.2.1.23) from mammals, fungi, plants and the bacterium
Xanthomonas manihotis are evolutionarily related [1,2]. They belong to family 35 in the
classification of glycosyl hydrolases [3,E1].

Mammalian beta-galactosidase is a lysosomal enzyme (gene GLB1) which cleaves the
terminal galactose from gangliosides, glycoproteins, and glycosaminoglycans and whose
deficiency is the cause of the genetic disease Gm(1) gangliosidosis (Morquio disease type B).

One of the best conserved regions in these enzymes contains a glutamic acid residue which, on
the basis of similarities with other families of glycosyl hydrolases [4], probably acts as the
proton donor in the catalytic mechanism. This region was used as a signature pattern.

Consensus pattern: G-G-P-[LIVM](2)-x(2)-Q-x-E-N-E-[FY] [The second E is the putative
active site residue]

Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Taron C.H., Benner J.S., Hornstra L.J., Guthrie E.P. Glycobiology 5:603-610(1995). 
[ 2] Carey A.T., Holt K., Picard S., Wilde R., Tucker G.A., Bird C.R., Schuch W., Seymour
G.B. Plant Physiol. 108:1099-1107(1995).
[ 3] Henrissat B., Bairoch A. Biochem. J. 293:781-788(1993). 

[ 4] Henrissat B., Callebaut I., Fabrega S., Lehn P., Mornon J.-P., Davies G. Proc. Natl. Acad.
Sci. U.S.A. 92:7090-7094(1995).

250. (Glyco_hydro_16) 

Glycosyl hydrolases family 16 signature 

It has been shown [1] that the following glycosyl hydrolases can be classified into a single 
family on the basis of sequence similarities: 

- Bacterial beta-1,3-1,4-glucanases, or lichenases, (EC 3.2.1.73) mainly from
Bacillus but also from Clostridium thermocellum (gene licB), Fibrobacter
succinogenes and Rhodothermus marinus (gene bglA).

- Bacillus circulans beta-1,3-glucanase A1 (EC 3.2.1.39) (gene glcA).

- Laminarinase (EC 3.2.1.6) from Clostridium thermocellum (gene lam1).

- Streptomyces coelicolor agarase (EC 3.2.1.81) (gene dagA). 

- Alteromonas carrageenovora kappa-carrageenase (EC 3.2.1.83) (gene cgkA). 

Two closely clustered conserved glutamates have been shown [2] to be involved in the
catalytic activity of Bacillus licheniformis lichenase. The region that contains these residues
was used as a signature pattern.

Consensus pattern: E-[LIV]-D-[LIV]-x(0,1)-E-x(2)-[GQ]-[KRNF]-x-[PSTA] [The two E's are
active site residues]
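
Because this pattern contains an optional residue between the two glutamates, the second glutamate does not sit at a fixed offset from the first. The sketch below, a hand-written regular expression that reads the gap as x(0,1), uses named groups to report where both glutamates fall in each hit. The function name and both toy sequences are hypothetical.

```python
import re

# Manual translation of the family 16 signature; the two glutamates are
# captured as named groups so that their positions can be reported.
GH16 = re.compile(r"(?P<e1>E)[LIV]D[LIV].{0,1}(?P<e2>E).{2}[GQ][KRNF].[PSTA]")

def active_site_glutamates(seq):
    """Return 1-based positions of both glutamates for every hit in seq."""
    return [(m.start("e1") + 1, m.start("e2") + 1) for m in GH16.finditer(seq)]

# Hypothetical fragments: one without and one with the optional residue.
no_gap = "AAEIDIEAAGKAPAA"
with_gap = "AAEIDISEAAGKAPAA"

print(active_site_glutamates(no_gap))
print(active_site_glutamates(with_gap))
```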

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Juncosa M., Pons J., Dot T., Querol E., Planas A. J. Biol. Chem. 269:14530-14535(1994).

251. (Glyco_hydro_17) 

Glycosyl hydrolases family 17 signature 

(aka glycosyl_hydro4) 

It has been shown [1,2] that the following glycosyl hydrolases can be classified into a single 
family on the basis of sequence similarities: 

- Glucan endo-1,3-beta-glucosidases (EC 3.2.1.39) (endo-(1->3)-beta-glucanase) from
various plants. This enzyme may be involved in the defense of plants against pathogens
through its ability to degrade fungal cell wall polysaccharides.

- Glucan 1,3-beta-glucosidase (EC 3.2.1.58) (exo-(1->3)-beta-glucanase) from yeast (gene
BGL2). This enzyme may play a role in cell expansion during growth, in cell-cell fusion
during mating, and in spore release during sporulation.

- Lichenases (EC 3.2.1.73) (endo-(1->3,1->4)-beta-glucanase) from various plants.

The best conserved region in the sequence of these enzymes is located in their central section. 
This region contains a conserved tryptophan residue which could be involved in the 
interaction with the glucan substrates [2] and it also contains a conserved glutamate which 
has been shown [3] to act as the nucleophile in the catalytic mechanism. This region was used 
as a signature pattern. 

Consensus pattern: [LIVM]-x-[LIVMFYWA](3)-[STAG]-E-[STA]-G-W-P-[STN]-x-[SAGQ]
[E is an active site residue] Sequences known to belong to this class detected by the pattern
ALL.

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Ori N., Sessa G., Lotan T., Himmelhoch S., Fluhr R. EMBO J. 9:3429-3436(1990). 
[ 3] Varghese J.N., Garrett T.P.J., Colman P.M., Chen L., Hoj P.J., Fincher G.B. Proc. Natl.
Acad. Sci. U.S.A. 91:2785-2789(1994).

252. (Glyco_hydro_3) 

Glycosyl hydrolases family 3 active site 

It has been shown [1,2] that the following glycosyl hydrolases can be, on the basis of 
sequence similarities, classified into a single family: 

- Beta glucosidases (EC 3.2.1.21) from the fungi Aspergillus wentii (A-3),
Hansenula anomala, Kluyveromyces fragilis, Saccharomycopsis fibuligera
(BGL1 and BGL2), Schizophyllum commune and Trichoderma reesei (BGL1).

- Beta glucosidases from the bacteria Agrobacterium tumefaciens (Cbg1),
Butyrivibrio fibrisolvens (bglA), Clostridium thermocellum (bglB),
Escherichia coli (bglX), Erwinia chrysanthemi (bgxA) and Ruminococcus
albus.

- Alteromonas strain O-7 beta-hexosaminidase A (EC 3.2.1.52).

- Bacillus subtilis hypothetical protein yzbA. 

- Escherichia coli hypothetical protein ycfO and HI0959, the corresponding
Haemophilus influenzae protein. 

One of the conserved regions in these enzymes is centered on a conserved aspartic acid 
residue which has been shown [3], in Aspergillus wentii beta-glucosidase A3, to be
implicated in the catalytic mechanism. This region was used as a signature pattern. 

Consensus pattern: [LIVM](2)-[KR]-x-[EQK]-x(4)-G-[LIVMFT]-[LIVT]-[LIVMF]-[ST]-D-
x(2)-[SGADNI] [D is the active site residue] Sequences known to belong to this class
detected by the pattern ALL.

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Castle L.A., Smith K.D., Morris R.O. J. Bacteriol. 174:1478-1486(1992). 
[ 3] Bause E., Legler G. Biochim. Biophys. Acta 626:459-465(1980). 

253. (Glyco_hydro_28) 
Polygalacturonase active site (aka PG) 

Polygalacturonase (EC 3.2.1.15) (PG) (pectinase) [1,2] catalyzes the random hydrolysis of 
1,4-alpha-D-galactosiduronic linkages in pectate and other galacturonans. In fruit, 
polygalacturonase plays an important role in cell wall metabolism during ripening. In plant 
bacterial pathogens such as Erwinia carotovora or Pseudomonas solanacearum and fungal 
pathogens such as Aspergillus niger, polygalacturonase is involved in maceration and
soft-rotting of plant tissue.

Exo-poly-alpha-D-galacturonosidase (EC 3.2.1.82) (exoPG) [3] hydrolyzes pectic acid from
the non-reducing end, releasing digalacturonate.

Prokaryotic and eukaryotic PG and exoPG share a few regions of sequence similarity. The best
conserved of these regions was selected. It is centered on a conserved histidine most 
probably involved in the catalytic mechanism [4]. 

Consensus pattern: [GSDENKRH]-x(2)-[VMFC]-x(2)-[GS]-H-G-[LIVMAG]-x(1,2)-[LIVM]-
G-S [H is the putative active site residue] Sequences known to belong to this class detected
by the pattern ALL.

Note: these proteins belong to family 28 in the classification of glycosyl hydrolases [5]. 

[ 1] Ruttowski E., Labitzke R., Khanh N.Q., Loeffler F., Gottschalk M., Jany K.-D. Biochim. 
Biophys. Acta 1087:104-106(1990). 

[ 2] Huang J., Schell M.A. J. Bacteriol. 172:3879-3887(1990).

[ 3] He S.Y., Collmer A. J. Bacteriol. 172:4988-4995(1990). 

[ 4] Bussink H.J.D., Buxton P.P., Visser J. Curr. Genet. 19:467-474(1991). 

[ 5] Henrissat B. Biochem. J. 280:309-316(1991). 

254. (Glyco_hydro_32) 

Glycosyl hydrolases family 32 active site 

It has been shown [1,2] that the following glycosyl hydrolases can be classified into a single 
family on the basis of sequence similarities: 

- Inulinase (EC 3.2.1.7) (or inulase) from the fungus Kluyveromyces marxianus.

- Beta-fructofuranosidase (EC 3.2.1.26), commonly known as invertase in fungi 
and plants and as sucrase in bacteria (gene sacA or scrB). 

- Raffinose invertase (EC 3.2.1.26) (gene rafD) from Escherichia coli plasmid 
pRSD2. 

- Levanase (EC 3.2.1.65) (gene sacC) from Bacillus subtilis. 

One of the conserved regions in these enzymes is located in the N-terminal section and
contains an aspartic acid residue which has been shown [3], in yeast invertase, to be important
for the catalytic mechanism. This region was used as a signature pattern.

Consensus pattern: H-x(2)-P-x(4)-[LIVM]-N-D-P-N-G [D is the active site residue]
Sequences known to belong to this class detected by the pattern ALL.

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Gunasekaran P., Karunakaran T., Cami B., Mukundan A.G., Preziosi L., Baratti J. J. 
Bacteriol. 172:6727-6735(1990). 

[ 3] Reddy V.A., Maley F. J. Biol. Chem. 265:10817-10820(1990).

255. (Glyco_hydro_1)

Glycosyl hydrolases family 1 signatures 

It has been shown [1 to 4] that the following glycosyl hydrolases can be, on the basis of 
sequence similarities, classified into a single family: 

- Beta-glucosidases (EC 3.2.1.21) from various bacteria such as Agrobacterium 
strain ATCC 21400, Bacillus polymyxa, and Caldocellum saccharolyticum. 

- Two plant (clover) beta-glucosidases (EC 3.2.1.21).

- Two different beta-galactosidases (EC 3.2.1.23) from the archaebacterium
Sulfolobus solfataricus (genes bgaS and lacS).

- 6-phospho-beta-galactosidases (EC 3.2.1.85) from various bacteria such as 
Lactobacillus casei, Lactococcus lactis, and Staphylococcus aureus. 

- 6-phospho-beta-glucosidases (EC 3.2.1.86) from Escherichia coli (genes bglB 
and ascB) and from Erwinia chrysanthemi (gene arbB). 

- Plant myrosinases (EC 3.2.3.1) (sinigrinase) (thioglucosidase).

- Mammalian lactase-phlorizin hydrolase (LPH) (EC 3.2.1.108 / EC 3.2.1.62).
LPH, an integral membrane glycoprotein, is the enzyme that splits lactose
in the small intestine. LPH is a large protein of about 1900 residues which
contains four tandem repeats of a domain of about 450 residues which is
evolutionarily related to the above glycosyl hydrolases.

One of the conserved regions in these enzymes is centered on a conserved glutamic acid 
residue which has been shown [5], in the beta-glucosidase from Agrobacterium, to be directly 
involved in glycosidic bond cleavage by acting as a nucleophile. This region was used as a 
signature pattern. As a second signature pattern, a conserved region found at the
N-terminal extremity of these enzymes was selected; this region also contains a glutamic
acid residue.

Consensus pattern: [LIVMFSTC]-[LIVFYS]-[LIV]-[LIVMST]-E-N-G-[LIVMFAR]-
[CSAGN] [E is the active site residue] Sequences known to belong to this class detected by
the pattern ALL.

Note: this pattern will pick up the last two domains of LPH; the first two domains, which are 
removed from the LPH precursor by proteolytic processing, have lost the active site 
glutamate and may therefore be inactive [4].

Consensus pattern: F-x-[FYWM]-[GSTA]-x-[GSTA]-x-[GSTA](2)-[FYNH]-[NQ]-x-E-x-
[GSTA] Sequences known to belong to this class detected by the pattern ALL.

Note: this pattern will pick up the last three domains of LPH. 
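
The two notes on LPH describe a protein in which each signature is found several times, once per intact domain. A minimal sketch of such a repeated scan is given below, using hand-written regular expressions for the two family 1 signatures. The toy four-domain sequence is entirely hypothetical: it is built so that the first two domains carry a glutamate-to-glutamine change in the active site motif and the first domain also lacks the N-terminal motif. It is not the real LPH sequence.

```python
import re

# Manual regex translations of the two family 1 signatures.
ACTIVE_SITE = re.compile(r"[LIVMFSTC][LIVFYS][LIV][LIVMST]ENG[LIVMFAR][CSAGN]")
N_TERMINAL = re.compile(r"F.[FYWM][GSTA].[GSTA].[GSTA]{2}[FYNH][NQ].E.[GSTA]")

def hit_starts(signature, seq):
    """Return 1-based start positions of all non-overlapping hits."""
    return [m.start() + 1 for m in signature.finditer(seq)]

# Hypothetical building blocks for a toy four-domain protein.
NTERM_OK = "FKFGAATASYQIEGA"     # matches the N-terminal signature
ACTIVE_OK = "LYITENGAA"          # matches the active site signature
ACTIVE_LOST = "LYITQNGAA"        # active site glutamate replaced by glutamine
LINKER = "DDDDD"

toy_domains = [
    ("", ACTIVE_LOST),
    (NTERM_OK, ACTIVE_LOST),
    (NTERM_OK, ACTIVE_OK),
    (NTERM_OK, ACTIVE_OK),
]
toy_protein = "".join(n + LINKER + c + LINKER for n, c in toy_domains)

print("active site hits:", hit_starts(ACTIVE_SITE, toy_protein))
print("N-terminal hits:", hit_starts(N_TERMINAL, toy_protein))
```

Counting the hits per signature gives a quick indication of how many domains retain each conserved region.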

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Henrissat B. Protein Seq. Data Anal. 4:61-62(1991). 

[ 3] Gonzalez-Candelas L., Ramon D., Polaina J. Gene 95:31-38(1990). 

[ 4] El Hassouni M., Henrissat B., Chippaux M., Barras F. J. Bacteriol. 174:765-777(1992). 

[ 5] Withers S.G., Warren R.A.J., Street I.P., Rupitz K., Kempton J.B., Aebersold R. J. Am.
Chem. Soc. 112:5887-5889(1990).

256. Glyco_hydro_20 
Glycosyl hydrolase family 20 
Previous Pfam IDs: glycosyl_hydr11;
Number of members: 33

257. (Glyco_hydro_9)

Glycosyl hydrolases family 9 active sites signatures 
(aka Glycosyl_hydr12)

The microbial degradation of cellulose and xylans requires several types of enzymes such as 
endoglucanases (EC 3.2.1.4), cellobiohydrolases (EC 3.2.1.91) (exoglucanases), or xylanases 
(EC 3.2.1.8) [1,2]. Fungi and bacteria produce a spectrum of cellulolytic enzymes
(cellulases) and xylanases which, on the basis of sequence similarities, can be classified into 
families. One of these families is known as the cellulase family E [3] or as the glycosyl 
hydrolases family 9 [4,E1]. The enzymes which are currently known to belong to this family 
are listed below. 

- Butyrivibrio fibrisolvens cellodextrinase 1 (ced1).

- Cellulomonas fimi endoglucanases B (cenB) and C (cenC). 

- Clostridium cellulolyticum endoglucanase G (celCCG). 

- Clostridium cellulovorans endoglucanase C (engC). 

- Clostridium stercorarium endoglucanase Z (avicelase I) (celZ).

- Clostridium thermocellum endoglucanases D (celD), F (celF) and I (celI).

- Fibrobacter succinogenes endoglucanase A (endA). 

- Pseudomonas fluorescens endoglucanase A (celA). 

- Streptomyces reticuli endoglucanase 1 (cel1).

- Thermomonospora fusca endoglucanase E-4 (celD). 

- Dictyostelium discoideum spore germination specific endoglucanase 270-6. This slime 
mold enzyme may digest the spore cell wall during germination, to release the enclosed 
amoeba. 

- Endoglucanases from plants such as avocado or French bean. In plants this enzyme may be
involved in the fruit ripening process.

Two of the most conserved regions in these enzymes are centered on conserved residues
which have been shown [5,6], in the endoglucanase D from Clostridium thermocellum, to
be important for the catalytic activity. The first region contains an active site histidine and the
second region contains two catalytically important residues: an aspartate and a glutamate.
Both regions were used as signature patterns.

Consensus pattern: [STV]-x-[LIVMFY]-[STV]-x(2)-G-x-[NKR]-x(4)-[PLIVM]-H-x-R [H is
an active site residue] Sequences known to belong to this class detected by the pattern ALL,
except for Cellulomonas fimi cenC and Streptomyces reticuli cel1.

Consensus pattern [FYW]-x-D-x(4)-[FYW]-x(3)-E-x-[STA]-x(3)-N-[STA] [D and E are 
active site residues] Sequences known to belong to this class detected by the pattern ALL, 
except for Fibrobacter succinogenes endA whose sequence seems to be incorrect. 
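
Statements such as "detected by the pattern ALL, except for ..." amount to scanning every known family member and listing those without a hit. The sketch below shows one way this check could be done for the histidine signature, using a hand-written regular expression. The member names and sequences are hypothetical placeholders, not real family 9 enzymes.

```python
import re

# Manual regex translation of the family 9 histidine signature.
GH9_HIS = re.compile(r"[STV].[LIVMFY][STV].{2}G.[NKR].{4}[PLIVM]H.R")

def missed_members(signature, members):
    """Return the sorted names of members in which the signature has no hit."""
    return sorted(name for name, seq in members.items()
                  if signature.search(seq) is None)

# Hypothetical family: two members carry the motif, one has lost the histidine.
toy_family = {
    "toy_member_A": "MKSALSAAGANAAAAPHARDD",
    "toy_member_B": "DDTQVTPPGDKPPPPLHERMK",
    "toy_divergent": "MKSALSAAGANAAAAPQARDD",
}

print(missed_members(GH9_HIS, toy_family))
```

A member that is missed may be genuinely divergent or, as noted above for endA, may simply have an incorrect sequence entry.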

[ 1] Beguin P. Annu. Rev. Microbiol. 44:219-248(1990). 

[ 2] Gilkes N.R., Henrissat B., Kilburn D.G., Miller R.C. Jr., Warren R.A.J. Microbiol. Rev. 
55:303-315(1991). 

[ 3] Henrissat B., Claeyssens M., Tomme P., Lemesle L., Mornon J.-P. Gene 81:83-95(1989).
[ 4] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 5] Tomme P., Chauvaux S., Beguin P., Millet J., Aubert J.-P., Claeyssens M. J. Biol. Chem.
266:10313-10318(1991).

[ 6] Tomme P., van Beeumen J., Claeyssens M. Biochem. J. 285:319-324(1992). 

258. Matrix protein (MA), p15 (GAG_ma)

The matrix protein, p15, is encoded by the gag gene. MA is involved in pathogenicity [1].

[1] Pozsgay JM, Beilharz MW, Wines BD, Hess AD, Pitha PM, J Virol
1993;67:5989-5999.

259. Gag polyprotein, inner coat protein p12 (GAG_P12)

The retroviral p12 is a virion structural protein. p12 is proline rich. The function
carried out by p12 in assembly and replication is unknown. p12C is associated with
pathogenicity of the virus [1].

[1] Pozsgay JM, Beilharz MW, Wines BD, Hess AD, Pitha PM, J Virol 1993;67:5989-5999.

260. Glutamine synthetase signatures (GLN-SYNT) 

Glutamine synthetase (EC 6.3.1.2) (GS) [1] plays an essential role in the metabolism of
nitrogen by catalyzing the condensation of glutamate and ammonia to form glutamine. There
seem to be three different classes of GS [2,3,4]:

- Class I enzymes (GSI) are specific to prokaryotes, and are oligomers of 12 identical
subunits. The activity of GSI-type enzymes is controlled by the adenylation of a tyrosine
residue. The adenylated enzyme is inactive.

- Class II enzymes (GSII) are found in eukaryotes and in bacteria belonging to the
Rhizobiaceae, Frankiaceae, and Streptomycetaceae families (these bacteria also have a
class-I GS). GSII are octamers of identical subunits. Plants have two or more isozymes of
GSII; one of the isozymes is translocated into the chloroplast.

- Class III enzymes (GSIII) have, currently, only been found in Bacteroides fragilis and in
Butyrivibrio fibrisolvens. GSIII is a hexamer of identical chains. It is much larger (about 700
amino acids) than the GSI (450 to 470 amino acids) or GSII (350 to 420 amino acids)
enzymes.

While the three classes of GS are clearly structurally related, the sequence similarities are
not so extensive. As signature patterns, three conserved regions were selected. The first
pattern is based on a conserved tetrapeptide in the N-terminal section of the enzyme; the
second one is based on a glycine-rich region which is thought to be involved in ATP-binding.
The third pattern is specific to class I glutamine synthetases and includes the tyrosine
residue which is reversibly adenylated.

Consensus pattern: [FYWL]-D-G-S-S-x(6,8)-[DENQSTAK]-[SA]-[DE]-x(2)-[LIVMFY]
Consensus pattern: K-P-[LIVMFYA]-x(3,5)-[NPAT]-G-[GSTAN]-G-x-H-x(3)-S
Consensus pattern: K-[LIVM]-x(5)-[LIVMA]-D-[RK]-[DN]-[LI]-Y [Y is the site of
adenylation]
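
Since the first two patterns are described as general glutamine synthetase signatures and the third as specific to class I, a sequence can be screened with all three and the combination of hits reported. The sketch below does this with hand-written regular expressions that transcribe the three patterns as printed, ignoring the trailing hyphens. The dictionary keys, the function name, and the toy fragments are hypothetical.

```python
import re

# Manual regex translations of the three glutamine synthetase signatures.
GS_SIGNATURES = {
    "n_terminal": re.compile(r"[FYWL]DGSS.{6,8}[DENQSTAK][SA][DE].{2}[LIVMFY]"),
    "atp_binding": re.compile(r"KP[LIVMFYA].{3,5}[NPAT]G[GSTAN]G.H.{3}S"),
    "adenylation": re.compile(r"K[LIVM].{5}[LIVMA]D[RK][DN][LI]Y"),
}

def gs_signature_report(seq):
    """Return, for each signature, whether seq contains at least one hit."""
    return {name: rx.search(seq) is not None for name, rx in GS_SIGNATURES.items()}

# Hypothetical motif instances joined into two toy sequences.
P1 = "FDGSSAAAAAADSDAAL"
P2 = "KPLDDDNGSGCHDDDS"
P3 = "KLDDDDDLDRNLY"
toy_class_i = "MM" + P1 + "GG" + P2 + "GG" + P3
toy_other = "MM" + P1 + "GG" + P2

print(gs_signature_report(toy_class_i))
print(gs_signature_report(toy_other))
```

A hit with the third signature is only suggestive of a class I enzyme; the absence of a hit does not distinguish between class II and class III.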

[ 1] Eisenberg D., Almassy R.J., Janson C.A., Chapman M.S., Suh S.W., Cascio D., Smith 
W.W. Cold Spring Harbor Symp. Quant. Biol. 52:483-490(1987). 

[ 2] Kumada Y., Benson D.R., Hillemann D., Hosted T.J., Rochefort D.A., Thompson C.J., 
Wohlleben W., Tateno Y. Proc. Natl. Acad. Sci. U.S.A. 90:3009-3013(1993). 
[ 3] Shatters R.G., Kahn M.L. J. Mol. Evol. 29:422-428(1989). 
[ 4] Brown J.R., Masuchi Y., Robb F.T., Doolittle W.F. J. Mol. Evol. 38:566-576(1994).

261. Globins profile (globin1)

Globins are heme-containing proteins involved in binding and/or transporting oxygen [1].
They belong to a very large and well studied family which is widely distributed in many
organisms. The major groups of globins are:

- Hemoglobins (Hb) from vertebrates. Hb is the protein responsible for transporting oxygen
from the lungs to other tissues. It is a tetramer of two alpha and two beta chains. Most
vertebrate species also express specific embryonic or fetal forms of hemoglobin where the
alpha or the beta chains are replaced by a chain with higher oxygen affinity, as for the
gamma, delta, epsilon and zeta chains in mammals, for example.

- Myoglobins (Mg) from vertebrates. Mg is a monomeric protein responsible for oxygen
storage in muscles.

- Invertebrate globins [2]. A wide variety of globins are found in invertebrates. Molluscs
generally have one or two muscle globins which are either monomeric or dimeric. Insects,
such as the midge Chironomus thummi, have a large set of extracellular globins. Nematodes
and annelids have a variety of intracellular and extracellular globins; some of them are
multi-domain polypeptides (from two up to nine-domain globins) and some produce large,
disulfide-bonded aggregates.

- Leghemoglobins (Lg) from the root nodules of leguminous plants. Lg provides oxygen for
bacteroids.

- Flavohemoproteins from bacteria (Escherichia coli hmpA) and fungi [3]. These proteins
consist of two distinct domains: an N-terminal globin domain and a C-terminal
FAD-containing reductase domain. In bacteria such as Vitreoscilla, the enzyme-associated
globin is a single domain protein.

All these globins seem to have evolved from a common ancestor. The profile developed to
detect members of the globin family is based on a structural alignment of selected globin
sequences.

[ 1] Concise Encyclopedia Biochemistry, Second Edition, Walter de Gruyter, Berlin New-York
(1988).
[ 2] Goodman M., Pedwaydon J., Czelusniak J., Suzuki T., Gotoh T., Moens L.,
Shishikura F., Walz D., Vinogradov S. J. Mol. Evol. 27:236-249(1988).

Plant hemoglobins signature (globin2) 

Leghemoglobins [1] are hemoproteins present in the root nodules of leguminous plants.
Leghemoglobins are structurally and functionally related to hemoglobin and myoglobin. By
providing oxygen to the bacteroids, they are essential for symbiotic nitrogen fixation.
Structurally related hemoglobins from the nodules of non-leguminous plants [2,3], and from
the roots of non-nodulating plants [4] have been recently sequenced. A signature pattern was
developed that picks up the sequences of plant hemoglobins exclusively.

Consensus pattern: [SN]-P-x-L-x(2)-H-A-x(3)-F

[ 1] Powell R., Gannon F. BioEssays 9:117-121(1988). 

[ 2] Kortt A.A., Trinick M.J., Appleby C.A. Eur. J. Biochem. 175:141-149(1988). 
[ 3] Kortt A.A., Inglis A.S., Fleming A.I., Appleby C.A. FEBS Lett. 231:341-346(1988).
[ 4] Bogusz D., Appleby C.A., Landsmann J., Dennis E.S., Trinick M.J., Peacock W.J. 
Nature 331:178-180(1988). 

262. Fructose-bisphosphate aldolase class-I active site (glycolytic_enz) 

Fructose-bisphosphate aldolase [1,2] is a glycolytic enzyme that catalyzes the
reversible aldol cleavage or condensation of fructose-1,6-bisphosphate into
dihydroxyacetone-phosphate and glyceraldehyde 3-phosphate. There are two classes of
fructose-bisphosphate aldolases with different catalytic mechanisms. Class-I aldolases [3],
mainly found in higher eukaryotes, are homotetrameric enzymes which form a Schiff-base
intermediate between the C-2 carbonyl group of the substrate (dihydroxyacetone
phosphate) and the epsilon-amino group of a lysine residue. In vertebrates, three forms of this
enzyme are found: aldolase A in muscle, aldolase B in liver and aldolase C in brain. The 
sequence around the lysine involved in the Schiff-base is highly conserved and can be used as 
a signature for this class of enzyme. 

Consensus pattern: [LIVM]-x-[LIVMFYW]-E-G-x-[LS]-L-K-P-[SN] [K is involved in
Schiff-base formation]

[ 1] Perham R.N. Biochem. Soc. Trans. 18:185-187(1990). 

[ 2] Marsh J. J., Lebherz H.G. Trends Biochem. Sci. 17:110-113(1992). 

[ 3] Freemont P.S., Dunbar B., Fothergill-Gilmore L.A. Biochem. J. 249:779-788(1988). 

263. Glycosyl hydrolases family 11 active sites signatures 

The microbial degradation of cellulose and xylans requires several types of enzymes such as
endoglucanases (EC 3.2.1.4), cellobiohydrolases (EC 3.2.1.91) (exoglucanases), or xylanases
(EC 3.2.1.8) [1,2]. Fungi and bacteria produce a spectrum of cellulolytic enzymes
(cellulases) and xylanases which, on the basis of sequence similarities, can be classified into
families. One of these families is known as the cellulase family G [3] or as the glycosyl
hydrolases family 11 [4,E1]. The enzymes which are currently known to belong to this family
are listed below.

- Aspergillus awamori xylanase C (xynC).

- Bacillus circulans, pumilus, stearothermophilus and subtilis xylanase (xynA).

- Clostridium acetobutylicum xylanase (xynB).

- Clostridium stercorarium xylanase A (xynA).

- Fibrobacter succinogenes xylanase C (xynC), which consists of two catalytic domains that
both belong to family 10.

- Neocallimastix patriciarum xylanase A (xynA).

- Ruminococcus flavefaciens bifunctional xylanase XYLA (xynA). This protein consists of
three domains: an N-terminal xylanase catalytic domain that belongs to family 11 of glycosyl
hydrolases; a central domain composed of short repeats of Gln, Asn and Trp; and a C-terminal
xylanase catalytic domain that belongs to family 10 of glycosyl hydrolases.

- Schizophyllum commune xylanase A.

- Streptomyces lividans xylanases B (xlnB) and C (xlnC).

- Trichoderma reesei xylanases I and II.

Two of the conserved regions in these enzymes are centered on glutamic acid residues
which have both been shown [5], in Bacillus pumilus xylanase, to be necessary for catalytic
activity. Both regions were used as signature patterns.

Consensus pattern: [PSA]-[LQ]-x-E-Y-Y-[LIVM](2)-[DE]-x-[FYWHN] [E is an active site
residue]

Consensus pattern: [LIVMF]-x(2)-E-[AG]-[YWG]-[QRFGS]-[SG]-[STAN]-G-x-[SAF] [E is
an active site residue]

[ 1] Beguin P. Annu. Rev. Microbiol. 44:219-248(1990). 

[ 2] Gilkes N.R., Henrissat B., Kilburn D.G., Miller R.C. Jr., Warren R.A.J. Microbiol. Rev. 
55:303-315(1991). 

[ 3] Henrissat B., Claeyssens M., Tomme P., Lemesle L., Mornon J. -P. Gene 81:83-95(1989). 
[ 4] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 5] Ko E.P., Akatsuka H., Moriyama H., Shinmyo A., Hata Y., Katsube Y., Urabe I., Okada 
H. Biochem. J. 288:117-121(1992). 

264. Glycosyl hydrolase family 14

The members of this family are beta-amylases.

265. Glycosyl hydrolases family 1 signatures 

It has been shown [1 to 4] that the following glycosyl hydrolases can be, on the basis of
sequence similarities, classified into a single family:

- Beta-glucosidases (EC 3.2.1.21) from various bacteria such as Agrobacterium strain
ATCC 21400, Bacillus polymyxa, and Caldocellum saccharolyticum.

- Two plant (clover) beta-glucosidases (EC 3.2.1.21).

- Two different beta-galactosidases (EC 3.2.1.23) from the archaebacterium Sulfolobus
solfataricus (genes bgaS and lacS).

- 6-phospho-beta-galactosidases (EC 3.2.1.85) from various bacteria such as Lactobacillus
casei, Lactococcus lactis, and Staphylococcus aureus.

- 6-phospho-beta-glucosidases (EC 3.2.1.86) from Escherichia coli (genes bglB and ascB)
and from Erwinia chrysanthemi (gene arbB).

- Plant myrosinases (EC 3.2.3.1) (sinigrinase) (thioglucosidase).

- Mammalian lactase-phlorizin hydrolase (LPH) (EC 3.2.1.108 / EC 3.2.1.62). LPH, an
integral membrane glycoprotein, is the enzyme that splits lactose in the small intestine.
LPH is a large protein of about 1900 residues which contains four tandem repeats of a
domain of about 450 residues which is evolutionarily related to the above glycosyl
hydrolases.

One of the conserved regions in these enzymes is centered on a conserved glutamic acid
residue which has been shown [5], in the beta-glucosidase from Agrobacterium, to be directly
involved in glycosidic bond cleavage by acting as a nucleophile. This region was used as a
signature pattern. As a second signature pattern, a conserved region found at the
N-terminal extremity of these enzymes was selected; this region also contains a glutamic
acid residue.

Consensus pattern: [LIVMFSTC]-[LIVFYS]-[LIV]-[LIVMST]-E-N-G-[LIVMFAR]-
[CSAGN] [E is the active site residue]

Note: this pattern will pick up the last two domains of LPH; the first two domains, which are 
removed from the LPH precursor by proteolytic processing, have lost the active site 
glutamate and may therefore be inactive [4]. 

Consensus pattern: F-x-[FYWM]-[GSTA]-x-[GSTA]-x-[GSTA](2)-[FYNH]-[NQ]-x-E-x-
[GSTA]

[ 1] Henrissat B. Biochem. J. 280:309-316(1991).

[ 2] Henrissat B. Protein Seq. Data Anal. 4:61-62(1991). 

[ 3] Gonzalez-Candelas L., Ramon D., Polaina J. Gene 95:31-38(1990). 

[ 4] El Hassouni M., Henrissat B., Chippaux M., Barras F. J. Bacteriol. 174:765-777(1992). 

[ 5] Withers S.G., Warren R.A.J., Street I.P., Rupitz K., Kempton J.B., Aebersold R. J. Am.
Chem. Soc. 112:5887-5889(1990).

266. Glycosyl hydrolases family 2 signatures 

It has been shown [1,2,E1] that the following glycosyl hydrolases can be, on the basis of
sequence similarities, classified into a single family:

- Beta-galactosidases (EC 3.2.1.23) from bacteria such as Escherichia coli (genes lacZ and
ebgA), Clostridium acetobutylicum, Clostridium thermosulfurogenes, Klebsiella pneumoniae,
Lactobacillus delbrueckii, or Streptococcus thermophilus and from the fungus
Kluyveromyces lactis.

- Beta-glucuronidase (EC 3.2.1.31) from Escherichia coli (gene uidA) and from mammals.

One of the conserved regions in these enzymes is centered on a conserved glutamic acid
residue which has been shown [3], in Escherichia coli lacZ, to be the general acid/base
catalyst in the active site of the enzyme. This region was used as a signature pattern. As a
second signature pattern, a highly conserved region located some sixty residues upstream
from the active site glutamate was selected.

Consensus pattern: N-x-[LIVMFYWD]-R-[STACN](2)-H-Y-P-x(4)-[LIVMFYWS](2)-x(3)-
[DN]-x(2)-G-[LIVMFYW](4)

Consensus pattern: [DENQLF]-[KRVW]-N-[HRY]-[STAPV]-[SAC]-[LIVMFS](3)-W-[GS]-
x(2,3)-N-E [E is the active site residue]
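
The text places the first signature some sixty residues upstream of the active site glutamate, so the spacing between the two hits can serve as a plausibility check. The sketch below measures that spacing with hand-written regular expressions. The function name, the choice to measure from the start of the upstream hit, and the toy sequence (built to give a spacing of exactly 60) are assumptions made for illustration only.

```python
import re

# Manual regex translations of the two family 2 signatures.
UPSTREAM = re.compile(
    r"N.[LIVMFYWD]R[STACN]{2}HYP.{4}[LIVMFYWS]{2}.{3}[DN].{2}G[LIVMFYW]{4}"
)
ACTIVE_SITE = re.compile(
    r"[DENQLF][KRVW]N[HRY][STAPV][SAC][LIVMFS]{3}W[GS].{2,3}N(?P<glu>E)"
)

def signature_spacing(seq):
    """Residues from the start of the upstream hit to the active site E, or None."""
    upstream = UPSTREAM.search(seq)
    active = ACTIVE_SITE.search(seq)
    if upstream is None or active is None:
        return None
    return active.start("glu") - upstream.start()

# Hypothetical motif instances separated by a 20-residue proline spacer.
UP_TOY = "NALRSSHYPAAAALLAAADAAGLLLL"
ACT_TOY = "DKNHPSIIIWSAANE"
toy_enzyme = "M" + UP_TOY + "P" * 20 + ACT_TOY + "PP"

print(signature_spacing(toy_enzyme))
```

A spacing far from sixty residues would suggest that the two hits belong to different regions or that one of them is spurious.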

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Schroeder C.J., Robert C., Lenzen G., McKay L.L., Mercenier A. J. Gen. Microbiol.
137:369-380(1991).

[ 3] Gebler J.C., Aebersold R., Withers S.G. J. Biol. Chem. 267:11126-11130(1992). 

267. Glycosyl hydrolases family 3 active site

It has been shown [1,2] that the following glycosyl hydrolases can be, on the basis of 
sequence similarities, classified into a single family: 

- Beta glucosidases (EC 3.2.1.21) from the fungi Aspergillus wentii (A-3),
Hansenula anomala, Kluyveromyces fragilis, Saccharomycopsis fibuligera
(BGL1 and BGL2), Schizophyllum commune and Trichoderma reesei (BGL1).

- Beta glucosidases from the bacteria Agrobacterium tumefaciens (Cbg1),
Butyrivibrio fibrisolvens (bglA), Clostridium thermocellum (bglB),
Escherichia coli (bglX), Erwinia chrysanthemi (bgxA) and Ruminococcus
albus.

- Alteromonas strain O-7 beta-hexosaminidase A (EC 3.2.1.52).

- Bacillus subtilis hypothetical protein yzbA. 

- Escherichia coli hypothetical protein ycfO and HI0959, the corresponding
Haemophilus influenzae protein. 

One of the conserved regions in these enzymes is centered on a conserved aspartic acid
residue which has been shown [3], in Aspergillus wentii beta-glucosidase A3, to be
implicated in the catalytic mechanism. This region was used as a signature pattern.

Consensus pattern: [LIVM](2)-[KR]-x-[EQK]-x(4)-G-[LIVMFT]-[LIVT]-[LIVMF]-[ST]-D- 
x(2)-[SGADNI] [D is the active site residue] 

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Castle L.A., Smith K.D., Morris R.O. J. Bacteriol. 174:1478-1486(1992). 
[ 3] Bause E., Legler G. Biochim. Biophys. Acta 626:459-465(1980). 

268. Glycosyl hydrolases family 8 signature 

The microbial degradation of cellulose and xylans requires several types of enzymes such as
endoglucanases (EC 3.2.1.4), cellobiohydrolases (EC 3.2.1.91) (exoglucanases), or xylanases
(EC 3.2.1.8) [1,2]. Fungi and bacteria produce a spectrum of cellulolytic enzymes
(cellulases) and xylanases which, on the basis of sequence similarities, can be classified into
families. One of these families is known as the cellulase family D [3] or as the glycosyl
hydrolases family 8 [4,E1]. The enzymes which are currently known to belong to this family
are listed below.

- Acetobacter xylinum endoglucanase cmcAX.

- Bacillus strain KSM-330 acidic endoglucanase K (Endo-K).

- Cellulomonas josui endoglucanase 2 (celB).

- Cellulomonas uda endoglucanase.

- Clostridium cellulolyticum endoglucanase C (celCCC).

- Clostridium thermocellum endoglucanase A (celA).

- Erwinia chrysanthemi minor endoglucanase Y (celY).

- Bacillus circulans beta-glucanase (EC 3.2.1.73).

- Escherichia coli hypothetical protein yhjM.

The most conserved region in these enzymes is a stretch of about 20 residues that contains
two conserved aspartates. The first aspartate is thought [5] to act as the nucleophile in the
catalytic mechanism. This region was used as a signature pattern.

Consensus pattern: A-[ST]-D-[AG]-D-x(2)-[IM]-A-x-[SA]-[LIVM]-[LIVMG]-x-A-x(3)-
[FW] [The first D is an active site residue]

[ 1] Beguin P. Annu. Rev. Microbiol. 44:219-248(1990). 

[ 2] Gilkes N.R., Henrissat B., Kilburn D.G., Miller R.C. Jr., Warren R.A.J. Microbiol. Rev.
55:303-315(1991).

[ 3] Henrissat B., Claeyssens M., Tomme P., Lemesle L., Mornon J.-P. Gene 81:83-95(1989). 
[ 4] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 5] Alzari P.M., Souchon H., Dominguez R. Structure 4:265-275(1996). 

269. Glycosyl hydrolases family 9 active sites signatures 

The microbial degradation of cellulose and xylans requires several types of enzymes such as
endoglucanases (EC 3.2.1.4), cellobiohydrolases (EC 3.2.1.91) (exoglucanases), or xylanases
(EC 3.2.1.8) [1,2]. Fungi and bacteria produce a spectrum of cellulolytic enzymes (cellulases)
and xylanases which, on the basis of sequence similarities, can be classified into families.
One of these families is known as the cellulase family E [3] or as the glycosyl hydrolases
family 9 [4,E1]. The enzymes which are currently known to belong to this family are listed
below.

- Butyrivibrio fibrisolvens cellodextrinase 1 (ced1).

- Cellulomonas fimi endoglucanases B (cenB) and C (cenC).

- Clostridium cellulolyticum endoglucanase G (celCCG).

- Clostridium cellulovorans endoglucanase C (engC).

- Clostridium stercorarium endoglucanase Z (avicelase I) (celZ).

- Clostridium thermocellum endoglucanases D (celD), F (celF) and I (celI).

- Fibrobacter succinogenes endoglucanase A (endA).

- Pseudomonas fluorescens endoglucanase A (celA).

- Streptomyces reticuli endoglucanase 1 (cel1).

- Thermomonospora fusca endoglucanase E-4 (celD).

- Dictyostelium discoideum spore germination specific endoglucanase 270-6. This slime
mold enzyme may digest the spore cell wall during germination, to release the enclosed
amoeba.

- Endoglucanases from plants such as avocado or French bean. In plants this enzyme may be
involved in the fruit ripening process.

Two of the most conserved regions in these enzymes are centered on conserved residues
which have been shown [5,6], in the endoglucanase D from Clostridium thermocellum, to
be important for the catalytic activity. The first region contains an active site histidine and the
second region contains two catalytically important residues: an aspartate and a glutamate.
Both regions were used as signature patterns.


Consensus pattern: [STV]-x-[LIVMFY]-[STV]-x(2)-G-x-[NKR]-x(4)-[PLIVM]-H-x-R [H is 
an active site residue] - 

Consensus pattern: [FYW]-x-D-x(4)-[FYW]-x(3)-E-x-[STA]-x(3)-N-[STA] [D and E are 
active site residues] - 
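
The consensus patterns in this specification follow the PROSITE notation: `x` stands for any residue, `[...]` lists the residues allowed at a position, `{...}` lists the residues excluded, and a suffix such as `(4)` or `(2,3)` gives a repeat count. As a minimal sketch of how such a pattern could be applied to a sequence, the Python below translates a pattern into a regular expression and reports 1-based match positions. The function names and the toy sequence are hypothetical and are not part of the original disclosure.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a regular expression."""
    tokens = []
    for element in pattern.split("-"):
        element = element.strip().rstrip(".")
        if not element:
            continue
        # Separate the residue part from an optional repeat suffix, e.g. x(4) or x(2,3).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        residues, low, high = match.groups()
        if residues == "x":
            token = "."                              # any residue
        elif residues.startswith("{"):
            token = "[^" + residues[1:-1] + "]"      # excluded residues
        else:
            token = residues                         # literal residue or [allowed set]
        if low and high:
            token += "{%s,%s}" % (low, high)
        elif low:
            token += "{%s}" % low
        tokens.append(token)
    return "".join(tokens)

def find_matches(pattern: str, sequence: str):
    """Return (1-based start, matched segment) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]

# First family 9 signature, as given in the text above.
HIS_PATTERN = "[STV]-x-[LIVMFY]-[STV]-x(2)-G-x-[NKR]-x(4)-[PLIVM]-H-x-R"

# Hypothetical toy sequence built to contain one match; not a real enzyme.
toy = "MK" + "SALSAAGANAAAAPHAR" + "GG"
print(prosite_to_regex(HIS_PATTERN))
print(find_matches(HIS_PATTERN, toy))   # [(3, 'SALSAAGANAAAAPHAR')]
```

A match only indicates that a segment fits the signature; it does not by itself establish that the protein belongs to the family.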


[ 1] Beguin P. Annu. Rev. Microbiol. 44:219-248(1990). 

[ 2] Gilkes N.R., Henrissat B., Kilburn D.G., Miller R.C. Jr., Warren R.A.J. Microbiol. Rev. 
55:303-315(1991). 

[ 3] Henrissat B., Claeyssens M., Tomme P., Lemesle L., Mornon J.-P. Gene 81:83-95(1989). 

[ 4] Henrissat B. Biochem. J. 280:309-316(1991).

[ 5] Tomme P., Chauvaux S., Beguin P., Millet J., Aubert J.-P., Claeyssens M. J. Biol. Chem. 
266:10313-10318(1991). 

[ 6] Tomme P., van Beeumen J., Claeyssens M. Biochem. J. 285:319-324(1992). 


270. Glyceraldehyde 3-phosphate dehydrogenase active site (gpdh) 
Glyceraldehyde 3-phosphate dehydrogenase (EC 1.2.1.12) (GAPDH) [1] is a tetrameric
NAD-binding enzyme common to both the glycolytic and gluconeogenic pathways. A
cysteine in the middle of the molecule is involved in forming a covalent phosphoglycerol
thioester intermediate. The sequence around this cysteine is totally conserved in eubacterial
and eukaryotic GAPDHs and is also present, albeit in a variant form, in the otherwise highly
divergent archaebacterial GAPDH [2]. Escherichia coli D-erythrose 4-phosphate
dehydrogenase (E4PDH) (gene epd or gapB) is an enzyme highly related to GAPDH [3].

Consensus pattern: [ASV]-S-C-[NT]-T-x(2)-[LIM] [C is the active site residue]-

[ 1] Harris J.I., Waters M. (In) The Enzymes (3rd edition) 13:1-50(1976). 

[ 2] Fabry S., Lang J., Niermann T., Vingron M., Hensel R. Eur. J. Biochem. 179:405-
413(1989).

[ 3] Zhao G., Pease A.J., Bharani N., Winkler M.E. J. Bacteriol. 177:2804-2812(1995).

271. Granulins signature

Granulins [1] are a family of cysteine-rich peptides of about 6 Kd which may have multiple
biological activities. A precursor protein (known as acrogranin) potentially encodes seven
different forms of granulin (grnA to grnG) which are probably released by post-translational
proteolytic processing. A schematic representation of the structure of a granulin is shown
below:

xxxCxxxxxCxxxxxCCxxxxxxxxCCxxxxxxCCxxxxxCCxxxxxCxxxxxxCx
                                  **************

'C': conserved cysteine probably involved in a disulfide bond. '*': position of the pattern.

Granulins are evolutionarily related to PMP-D1, a peptide extracted from the pars
intercerebralis of migratory locusts [2].

Consensus pattern: C-x-D-x(2)-H-C-C-P-x(4)-C [The four Cs are probably involved in 
disulfide bonds] - 

[ 1] Bhandari V., Palfree R.G., Bateman A. Proc. Natl. Acad. Sci. U.S.A. 89:1715- 
1719(1992). 

[ 2] Nakakura N., Hietter H., van Dorsselaer A., Luu B. Eur. J. Biochem. 204:147-153(1992). 

272. (HCV RdRp) Hepatitis C virus RNA dependent RNA polymerase 

The RNA dependent RNA polymerase is also known as
non-structural protein NS5B. NS5B is a 65 kDa protein
that resembles other viral RNA polymerases. HCV replication
is thought to occur in membrane bound replication
complexes. These complexes transcribe the positive
strand and the resulting minus strand is used as a
template for the synthesis of genomic RNA. There are
two viral proteins involved in the reaction, NS3 and NS5B [1,2].

[1] Lohmann V, Korner F, Herian U, Bartenschlager R; J Virol 1997;71:8416-8428.
[2] Behrens SE, Tomei L, De Francesco R; EMBO J 1996;15:12-22.
[3] Ishido S, Fujita T, Hotta H; Biochem Biophys Res Commun 1998;244:35-40.

273. (HHH) Helix-hairpin-helix motif. 

[1] Doherty AJ, Serpell LC, Ponting CP; Nucleic Acids Res 1996;24:2488-2497. 

274. HIT family signature 

Recently a family of small proteins of about 12 to 16 Kd has been described [1]. This family
currently consists of: - Mammalian protein HINT (also known as Protein kinase C inhibitor 1
or PKCI-1). HINT was incorrectly thought to be a specific inhibitor of PKC. It has been
shown to bind zinc. - Fission yeast diadenosine 5',5'''-P1,P4-tetraphosphate asymmetrical
hydrolase (Ap4Aase) (EC 3.6.1.17) [2] (gene aph1), which cleaves A-5'-PPPP-5'A to yield
AMP and ATP. - FHIT, a human protein whose gene is altered in different tumors and which
acts [3] as a diadenosine 5',5'''-P1,P3-triphosphate hydrolase (Ap3Aase) (EC 3.6.1.29)
cleaving A-5'-PPP-5'A to yield AMP and ADP. - Yeast proteins HNT1 and HNT2. - Maize
zinc-binding protein ZBP14. - Escherichia coli hypothetical protein ycfF. - Haemophilus
influenzae hypothetical protein HI0961. - Helicobacter pylori hypothetical protein HP0404. -
Methanococcus jannaschii hypothetical protein MJ0866. - Mycobacterium leprae
hypothetical protein U296A. - Synechocystis strain PCC 6803 hypothetical protein slr1234. -
Caenorhabditis elegans hypothetical protein F21C3.3. - A hypothetical 13.2 Kd protein in
hisE 3'region in Azospirillum brasilense. - A hypothetical 13.1 Kd protein in p37 5'region in
Mycoplasma hyorhinis. - A hypothetical 12.4 Kd protein in psbAII 5'region in
Synechococcus strain PCC 7942. All these proteins contain a region with three clustered
histidines. This region is responsible for the designation of this family: HIT, for
'HIstidine Triad' [1]. This region was originally thought to be involved in the binding of a zinc
ion but was later identified [4] as part of the alpha-phosphate binding site of a nucleotide-
binding domain. As a signature pattern, the region of the histidine triad was selected.

Consensus pattern: [NQA]-x(4)-[GAV]-x-[QF]-x-[LIVM]-x-H-[LIVMFYT]-H-[LIVMFT]- 
H-[LIVMF](2)-[PSGA]- 

[ 1] Seraphin B. DNA Seq. 3:177-179(1992). 

[ 2] Huang Y., Garrison P.N., Barnes L.D. Biochem. J. 312:925-932(1995). 

[ 3] Barnes L.D., Garrison P.N., Siprashvili Z., Guranowski A., Robinson A.K., Ingram S.W.,
Croce C.M., Ohta M., Huebner K. Biochemistry 35:11529-11535(1996).

[ 4] Brenner C., Garrison P., Gilmour J., Peisach D., Ringe D., Petsko G.A., Lowenstein J.M.
Nat. Struct. Biol. 4:231-238(1997).

275. Myc-type, 'helix-loop-helix' dimerization domain signature (HLH) 
A number of eukaryotic proteins, which probably are sequence specific DNA-binding 
proteins that act as transcription factors, share a conserved domain of 40 to 50 amino acid 
residues. It has been proposed [1] that this domain is formed of two amphipathic helices 
joined by a variable length linker region that could form a loop. This 'helix-loop-helix' (HLH) 
domain mediates protein dimerization and has been found in the proteins listed below 
[2,3,E1,E2]. Most of these proteins have an extra basic region of about 15 amino acid 
residues that is adjacent to the HLH domain and specifically binds to DNA. They are referred
to as basic helix-loop-helix proteins (bHLH), and are classified in two groups: class A
(ubiquitous) and class B (tissue-specific). Members of the bHLH family bind variations on 
the core sequence 'CANNTG', also referred to as the E-box motif. The homo- or 
heterodimerization mediated by the HLH domain is independent of, but necessary for DNA 
binding, as two basic regions are required for DNA binding activity. The HLH proteins 
lacking the basic domain (Emc, Id) function as negative regulators since they form 
heterodimers, but fail to bind DNA. The hairy-related proteins (hairy, E(spl), deadpan) also 
repress transcription although they can bind DNA. The proteins of this subfamily act together 
with co-repressor proteins, like groucho, through their C-terminal motif WRPW. - The myc
family of cellular oncogenes [4], which is currently known to contain four members: c-myc
[E3], N-myc, L-myc, and B-myc. The myc genes are thought to play a role in cellular
differentiation and proliferation. - Proteins involved in myogenesis (the induction of muscle
cells). In mammals MyoD1 (Myf-3), myogenin (Myf-4), Myf-5, and Myf-6 (Mrf4 or
herculin), in birds CMD1 (QMF-1), in Xenopus MyoD and MF25, in Caenorhabditis elegans
CeMyoD, and in Drosophila nautilus (nau). - Vertebrate proteins that bind specific DNA
sequences ('E boxes') in various immunoglobulin chains enhancers: E2A or ITF-1 (E12/pan-2
and E47/pan-1), ITF-2 (tcf4), TFE3, and TFEB. - Vertebrate neurogenic differentiation factor
1 that acts as differentiation factor during neurogenesis. - Vertebrate MAX protein, a
transcription regulator that forms a sequence-specific DNA-binding protein complex with
myc or mad. - Vertebrate Max Interacting Protein 1 (MXI1 protein) which acts as a
transcriptional repressor and may antagonize myc transcriptional activity by competing for
max. - Proteins of the bHLH/PAS superfamily which are transcriptional activators. In
mammals, AH receptor nuclear translocator (ARNT), single-minded homologs (SIM1 and
SIM2), hypoxia-inducible factor 1 alpha (HIF1A), AH receptor (AHR), neuronal pas domain
proteins (NPAS1 and NPAS2), endothelial pas domain protein 1 (EPAS1), mouse ARNT2,
and human BMAL1. In Drosophila, single-minded (SIM), AH receptor nuclear translocator
(ARNT), trachealess protein (TRH), and similar protein (SIMA). - Mammalian transcription
factors HES, which repress transcription by acting on two types of DNA sequences, the E
box and the N box. - Mammalian MAD protein (max dimerizer) which acts as transcriptional
repressor and may antagonize myc transcriptional activity by competing for max. -
Mammalian Upstream Stimulatory Factor 1 and 2 (USF1 and USF2), which bind to a
symmetrical DNA sequence that is found in a variety of viral and cellular promoters. -
Human lyl-1 protein, which is involved, by chromosomal translocation, in T-cell leukemia. -
Human transcription factor AP-4. - Mouse helix-loop-helix proteins MATH-1 and MATH-2
which activate E box-dependent transcription in collaboration with E47. - Mammalian stem
cell protein (SCL) (also known as tal1), a protein which may play an important role in
hemopoietic differentiation. SCL is involved, by chromosomal translocation, in stem-cell
leukemia. - Mammalian proteins Id1 to Id4 [5]. Id (inhibitor of DNA binding) proteins lack a
basic DNA-binding domain but are able to form heterodimers with other HLH proteins,
thereby inhibiting binding to DNA. - Drosophila extra-macrochaetae (emc) protein, which
participates in sensory organ patterning by antagonizing the neurogenic activity of the
achaete-scute complex. Emc is the homolog of mammalian Id proteins. - Human Sterol
Regulatory Element Binding Protein 1 (SREBP-1), a transcriptional activator that binds to the
sterol regulatory element 1 (SRE-1) found in the flanking region of the LDLR gene and in
other genes. - Drosophila achaete-scute (AS-C) complex proteins T3 (l'sc), T4 (scute), T5
(achaete) and T8 (asense). The AS-C proteins are involved in the determination of the 
neuronal precursors in the peripheral nervous system and the central nervous system. - 
Mammalian homologs of achaete-scute proteins, the MASH-1 and MASH-2 proteins. - 
Drosophila atonal protein (ato) which is involved in neurogenesis. - Drosophila daughterless 
(da) protein, which is essential for neurogenesis and sex-determination. - Drosophila deadpan 
(dpn), a hairy-like protein involved in the functional differentiation of neurons. - Drosophila 
delilah (del) protein, which plays an important role in the differentiation of epidermal cells
into muscle. - Drosophila hairy (h) protein, a transcriptional repressor which regulates the 
embryonic segmentation and adult bristle patterning. - Drosophila enhancer of split proteins 
E(spl), that are hairy-like proteins active during neurogenesis, also act as transcriptional 
repressors. - Drosophila twist (twi) protein, which is involved in the establishment of germ 
layers in embryos. - Maize anthocyanin regulatory proteins R-S and LC. - Yeast centromere-
binding protein 1 (CPF1 or CBF1). This protein is involved in chromosomal segregation. It
binds to a highly conserved DNA sequence, found in centromeres and in several promoters. -
Yeast INO2 and INO4 proteins. - Yeast phosphate system positive regulatory protein PHO4
which interacts with the upstream activating sequence of several acid phosphatase genes. -
Yeast serine-rich protein TYE7 that is required for Ty-mediated ADH2 expression. -
Neurospora crassa nuc-1, a protein that activates the transcription of structural genes for
phosphorus acquisition. - Fission yeast protein esc1 which is involved in the sexual
differentiation process. The schematic representation of the helix-loop-helix domain is shown
here:

xxxxxxxxxxxxxxxxxxxxxxxx        xxxxxxxxxxxxxxxxxxxxxxx
  Amphipathic helix 1     Loop    Amphipathic helix 2

The signature pattern developed to detect
this domain completely spans the second amphipathic helix.

Consensus pattern: [DENSTAP]-[KTR]-[LIVMAGSNT]-{FYWCPHKR}-[LIVMT]-
[LIVM]-x(2)-[STAV]-[LIVMSTACKR]-x-[VMFYH]-[LIVMTA]-{P}-{P}-
[LIVMRKHQ]-
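
The E-box core sequence 'CANNTG' mentioned above is written with the IUPAC ambiguity code N (any base). As an illustrative sketch only, the Python below expands such a motif into a regular expression and lists every occurrence in a DNA string, including overlapping ones. The function name, the lookup table, and the example DNA string are hypothetical and are not taken from the original text.

```python
import re

# Minimal IUPAC table: only the symbols needed for 'CANNTG' (an assumption of this sketch).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

def find_motif(dna: str, motif: str = "CANNTG"):
    """Return (0-based start, matched site) for every occurrence of the motif."""
    regex = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead keeps the scan position at the match start, so overlapping sites are reported.
    return [(m.start(), m.group(1))
            for m in re.finditer("(?=(" + regex + "))", dna.upper())]

# Hypothetical example fragment containing two E-box variants.
print(find_motif("ttCAGCTGaaCACGTGcc"))   # [(2, 'CAGCTG'), (10, 'CACGTG')]
```

Because CANNTG is its own reverse complement pattern, scanning one strand is enough to find sites on both strands.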



[ 1] Murre C., McCaw P.S., Baltimore D. Cell 56:777-783(1989).
[ 2] Garrel J., Campuzano S. BioEssays 13:493-498(1991).
[ 3] Kato G.J., Dang C.V. FASEB J. 6:3065-3072(1992).

[ 4] Krause M., Fire A., Harrison S.W., Priess J., Weintraub H. Cell 63:907-919(1990). 
[ 5] Riechmann V., van Cruechten I., Sablitzky F. Nucleic Acids Res. 22:749-755(1994). 

276. HMG14 and HMG17 signature 

High mobility group (HMG) proteins are a family of relatively low molecular weight non- 
histone components in chromatin. HMG14 and HMG17 [1], two related proteins of about 100 
amino acid residues, bind to the inner side of the nucleosomal DNA thus altering the 
interaction between the DNA and the histone octamer. These two proteins may be involved in 
the process which maintains transcribable genes in a unique chromatin conformation. The 
trout nonhistone chromosomal protein H6 (histone T) also belongs to this family. As a 
signature pattern a conserved stretch of 10 residues located in the N-terminal section of 
HMG14 and HMG17 was selected. 

Consensus pattern: R-R-S-A-R-L-S-A-[RK]-P- 

[ 1] Bustin M., Reeves R. Prog. Nucleic Acid Res. Mol. Biol. 54:35-100(1996).

277. Hydroxymethylglutaryl-coenzyme A lyase active site (HMGL1)
3-hydroxy-3-methylglutaryl-coenzyme A lyase (HMG-CoA lyase or HL) (EC
4.1.3.4) catalyzes the transformation of HMG-CoA into acetyl-CoA and acetoacetate. In
vertebrates it is a mitochondrial enzyme which is involved in ketogenesis and in leucine
catabolism [1]. In some bacteria, such as Pseudomonas mevalonii, it is involved in
mevalonate catabolism (gene mvaB). A cysteine has been shown [2], in mvaB, to be required
for the activity of the enzyme. The region around this residue is perfectly conserved and is 
used as a signature pattern. 

Consensus pattern: S-V-A-G-L-G-G-C-P-Y [C is the active site residue]-

[ 1] Mitchell G.A., Robert M.-F., Hruz P.W., Wang S., Fontaine G., Behnke C.E., Mende-
Mueller L.M., Schappert K., Lee C., Gibson K.M., Miziorko H.M. J. Biol. Chem. 268:4376-
4381(1993).

[ 2] Hruz P.W., Narasimhan C., Miziorko H.M. Biochemistry 31:6842-6847(1992).

Alpha-isopropylmalate and homocitrate synthases signatures (HMGL2) 
The following enzymes have been shown [1] to be functionally as well as evolutionarily
related: - Alpha-isopropylmalate synthase (EC 4.1.3.12) which catalyzes the first step in the
biosynthesis of leucine, the condensation of acetyl-CoA and alpha-ketoisovalerate to form
2-isopropylmalate. - Homocitrate synthase (EC 4.1.3.21) (gene nifV) which is
involved in the biosynthesis of the iron-molybdenum cofactor of nitrogenase and catalyzes 
the condensation of acetyl-CoA and alpha-ketoglutarate into homocitrate. - Soybean late 
nodulin 56. - Methanococcus jannaschii hypothetical proteins MJ0503, MJ1195 and MJ1392. 
Two conserved regions were selected as signature patterns for these enzymes. The first region 
is located in the N-terminal section while the second region is located in the central section 
and contains two conserved histidine residues which could be implicated in the catalytic 
mechanism. 

Consensus pattern: L-R-[DE]-G-x-Q-x(10)-K- 

Consensus pattern: [LIVMFW]-x(2)-H-x-H-[DN]-D-x-G-x-[GAS]-x-[GASLI]- 

[ 1] Wang S.-Z., Dean D.R., Chen J.-S., Johnson J.L. J. Bacteriol. 173:3041-3046(1991). 

278. (HMG COA synt) Hydroxymethylglutaryl-coenzyme A synthase active site 
Hydroxymethylglutaryl-coenzyme A synthase (EC 4.1.3.5) (HMG-CoA synthase) catalyzes
the condensation of acetyl-CoA with acetoacetyl-CoA to produce HMG-CoA and CoA [1]. In
vertebrates there are two isozymes located in different subcellular compartments: a cytosolic
form which is the starting point of the mevalonate pathway which leads to cholesterol and
other sterolic and isoprenoid compounds and a mitochondrial form responsible for ketone
body biosynthesis. HMG-CoA is also found in other eukaryotes such as insects, plants and
fungi. A cysteine is known to act as the catalytic nucleophile in the first step of the reaction:
the acetylation of the enzyme by acetyl-CoA. The conserved region around this
active site residue was used as a signature pattern.

Consensus pattern: N-x-[DN]-[IV]-E-G-[IV]-D-x(2)-N-A-C-[FY]-x-G [C is the active site 
residue] - 

[ 1] Rokosz L.L., Boulton D.A., Butkiewicz E.A., Sanyal G., Cueto M.A., Lachance P.A.,
Hermes J.D. Arch. Biochem. Biophys. 312:1-13(1994). 

279. HMG (high mobility group) box 

280. HSF-type DNA-binding domain signature 

Heat shock factor (HSF) is a DNA-binding protein that specifically binds heat shock 
promoter elements (HSE). HSE is a palindromic element rich with repetitive purine and 
pyrimidine motifs: 5'-nGAAnnTTCnnGAAnnTTCn-3'. HSF is expressed at normal 
temperatures but is activated by heat shock or chemical stressors [1,2]. The sequences of HSF 
from various species show extensive similarity in a region of about 90 amino acids, which 
has been shown [3] to bind DNA. Some other proteins also contain an HSF domain; these are:
- Yeast SFL1, a protein involved in cell surface assembly and regulation of the gene related
to flocculation (asexual cell aggregation) [4]. - Yeast transcription factor SKN7 (or BRY1 or
POS9), which binds to the promoter elements SCB and MCB essential for the control of G1
cyclins expression [5]. - Yeast MGA1. - Yeast hypothetical protein YJR147w. A pattern was
derived from the most conserved part of the HSF DNA-binding domain, its central region.

Consensus pattern: L-x(3)-[FY]-K-H-x-N-x-[STAN]-S-F-[LIVM]-R-Q-L-[NH]-x-Y-x- 
[FYW]-[RKH]-K-[LIVM]- 
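
The heat shock element quoted above, 5'-nGAAnnTTCnnGAAnnTTCn-3', is described as palindromic. A small sketch in Python can make that property concrete: the element equals its own reverse complement, and it splits into four five-base units that alternate between nGAAn and its reverse complement nTTCn. The helper names are hypothetical; 'n' is treated as pairing with 'n'.

```python
# Complement table; 'n' (any base) is assumed to pair with 'n'.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def split_units(seq: str, size: int = 5):
    """Cut a sequence into consecutive units of the given size."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

HSE = "nGAAnnTTCnnGAAnnTTCn"   # element as given in the text

print(reverse_complement(HSE) == HSE)   # True: the element reads the same on both strands
print(split_units(HSE))                 # ['nGAAn', 'nTTCn', 'nGAAn', 'nTTCn']
print(reverse_complement("nGAAn"))      # nTTCn
```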

[ 1] Sorger P.K. Cell 65:363-366(1991). 

[ 2] Mager W.H., Moradas Ferreira P. Biochem. J. 290:1-13(1993). 

[ 3] Vuister G.W., Kim S.-J., Orosz A., Marquardt J., Wu C., Bax A. Nat. Struct. Biol. 1:605-
613(1994).
[ 4] Fujita A., Kikuchi Y., Kuhara S., Misumi Y., Matsumoto S., Kobayashi H. Gene 85:321-
328(1989).

[ 5] Morgan B.A., Bouquin N., Merrill G.F., Johnston L.H. EMBO J. 14:5679-5689(1995).

281. Heat shock hsp20 proteins family profile

Prokaryotic and eukaryotic organisms respond to heat shock or other environmental stress by 
inducing the synthesis of proteins collectively known as heat-shock proteins (hsp) [1]. 
Amongst them is a family of proteins with an average molecular weight of 20 Kd, known as 
the hsp20 proteins [2 to 5]. These seem to act as chaperones that can protect other proteins 
against heat-induced denaturation and aggregation. Hsp20 proteins seem to form large 
heterooligomeric aggregates; their family is currently composed of the following members: - 
Vertebrate heat shock protein hsp27 (hsp25), induced by a variety of environmental stresses. 
- Drosophila heat shock proteins hsp22, hsp23, hsp26, hsp27, hsp67BA and BC. - 
Caenorhabditis elegans hsp16 multigene family. - Fungal HSP26 (budding yeast) and hsp30
(Neurospora crassa and Aspergillus nidulans). - Plant small hsp's. Plants have four classes of
hsp20: classes I and II which are cytoplasmic, class III which is chloroplastic and class IV 
which is found in the endomembrane. - Alpha-crystallin A and B chains. Alpha-crystallin is 
an abundant constituent of the eye lens of most vertebrate species. Its main function appears 
to be to maintain the correct refractive index of the lens. It is also found in other tissues 
where it seems to act as a chaperone [6]. - Schistosoma mansoni major egg antigen p40. 
Structurally, p40 is built of two tandem hsp20 domains. - A variety of prokaryotic proteins: 
ibpA and ibpB from Escherichia coli, hsp18 from Clostridium acetobutylicum, spore protein
SP21 (hspA) from Stigmatella aurantiaca, Mycobacterium leprae 18 Kd antigen and
Mycobacterium tuberculosis 14 Kd antigen. - Methanococcus jannaschii hypothetical protein
MJ0285. Structurally, this family is characterized by the presence of a conserved C-terminal
domain of about 100 residues. The profile developed to detect members of the hsp20 family 
is based on an alignment of this domain. 

-Sequences known to belong to this class detected by the profile: ALL. 

[ 1] Lindquist S., Craig E.A. Annu. Rev. Genet. 22:631-677(1988).
[ 2] de Jong W.W., Leunissen J.A.M., Voorter C.E.M. Mol. Biol. Evol. 10:103-126(1993).
[ 3] Caspers G.J., Leunissen J.A.M., de Jong W.W. J. Mol. Evol. 40:238-248(1995).
[ 4] Jaenicke R., Creighton T.E. Curr. Biol. 3:234-235(1993).
[ 5] Jakob U., Buchner J. Trends Biochem. Sci. 19:205-211(1994).
[ 6] Groenen P.J.T.A., Merck K.B., de Jong W.W., Bloemendal H. Eur. J. Biochem.
225:1-9(1994).



282. Heat shock hsp70 proteins family signatures 

Prokaryotic and eukaryotic organisms respond to heat shock or other environmental
stress by the induction of the synthesis of proteins collectively known as heat-shock proteins
(hsp) [1]. Amongst them is a family of proteins with an average molecular weight of 70 Kd,
known as the hsp70 proteins [2,3,4]. In most species, there are many proteins that belong to
the hsp70 family. Some of them are expressed under unstressed conditions. Hsp70 proteins
can be found in different cellular compartments (nuclear, cytosolic, mitochondrial,
endoplasmic reticulum, etc.). Some of the hsp70 family proteins are listed below: - In
Escherichia coli and other bacteria, the main hsp70 protein is known as the dnaK protein. A
second protein, hscA, has been recently discovered. dnaK is also found in the chloroplast
genome of red algae. - In yeast, at least ten hsp70 proteins are known to exist: SSA1 to SSA4,
SSB1, SSB2, SSC1, SSD1 (KAR2), SSE1 (MSI3) and SSE2. - In Drosophila, there are at
least eight different hsp70 proteins: HSP70, HSP68, and HSC-1 to HSC-6. - In mammals,
there are at least eight different proteins: HSPA1 to HSPA6, HSC70, and GRP78 (also known
as the immunoglobulin heavy chain binding protein (BiP)). - In the sugar beet yellow virus
(SBYV), a hsp70 homolog has been shown [5] to exist. - In archaebacteria, hsp70 proteins
are also present [6]. All proteins belonging to the hsp70 family bind ATP. A variety of
functions has been postulated for hsp70 proteins. It now appears [7] that some hsp70 proteins
play an important role in the transport of proteins across membranes. They also seem to be
involved in protein folding and in the assembly/disassembly of protein complexes [8]. Three
signature patterns for the hsp70 family of proteins were derived; the first centered on a
conserved pentapeptide found in the N-terminal section of these proteins; the two others on
conserved regions located in the central part of the sequence.

Consensus pattern: [IV]-D-L-G-T-[ST]-x-[SC] - 

Consensus pattern: [LIVMF]-[LIVMFY]-[DN]-[LIVMFS]-G-[GSH]-[GS]-[AST]-x(3)-[ST]-
[LIVM]-[LIVMFC]-

Consensus pattern: [LIVMY]-x-[LIVMF]-x-G-G-x-[ST]-x-[LIVM]-P-x-[LIVM]-x-
[DEQKRSTA]-

[ 1] Lindquist S., Craig E.A. Annu. Rev. Genet. 22:631-677(1988).
[ 2] Pelham H.R.B. Cell 46:959-961(1986).
[ 3] Pelham H.R.B. Nature 332:776-777(1988).
[ 4] Craig E.A. BioEssays 11:48-52(1989).
[ 5] Agranovsky A.A., Boyko V.P., Karasev A.V., Koonin E.V., Dolja V.V. J. Mol. Biol.
217:603-610(1991).

[ 6] Gupta R.S., Singh B. J. Bacteriol. 174:4594-4605(1992).

[ 7] Deshaies R.J., Koch B.D., Schekman R. Trends Biochem. Sci. 13:384-388(1988).
[ 8] Craig E.A., Gross C.A. Trends Biochem. Sci. 16:135-140(1991).

283. Heat shock hsp90 proteins family signature 
Prokaryotic and eukaryotic organisms respond to heat shock or other environmental stress by
the induction of the synthesis of proteins collectively known as heat-shock proteins (hsp) [1].
Amongst them is a family of proteins, with an average molecular weight of 90 Kd, known as
the hsp90 proteins. Proteins known to belong to this family are: - Escherichia coli and other
bacteria heat shock protein c62.5 (gene htpG). - Vertebrate hsp 90-alpha (hsp 86) and hsp 90-
beta (hsp 84). - Drosophila hsp 82 (hsp 83). - Trypanosoma cruzi hsp 85. - Plants Hsp82 or
Hsp83. - Yeast and other fungi HSC82, and HSP82. - The endoplasmic reticulum protein
'endoplasmin' (also known as Erp99 in mouse, GRP94 in hamster, and hsp 108 in
chicken). The exact function of hsp90 proteins is not yet known. In higher eukaryotes, hsp90
has been found associated with steroid hormone receptors, with tyrosine kinase oncogene
products of several retroviruses, with eIF2alpha kinase, and with actin and tubulin. Hsp90 are
probable chaperonins that possess ATPase activity [2,3]. As a signature pattern for the hsp90
family of proteins, a highly conserved region found in the N-terminal part of these proteins
was selected.

Consensus pattern: Y-x-[NQH]-K-[DE]-[IVA]-F-L-R-[ED] - 

[ 1] Lindquist S., Craig E.A. Annu. Rev. Genet. 22:631-677(1988).

[ 2] Nadeau K., Das A., Walsh C.T. J. Biol. Chem. 268:1479-1487(1993). 
[ 3] Jakob U., Buchner J. Trends Biochem. Sci. 19:205-211(1994).

284. Helix-turn-helix (HTH3)

This large family of DNA binding helix-turn-helix proteins includes Cro
Swiss:P03036 and CI Swiss:P03034.



285. Heme oxygenase signature 

Heme oxygenase (EC 1.14.99.3) (HO) [1] is the microsomal enzyme that, in animals, carries
out the oxidation of heme; it cleaves the heme ring at the alpha-methene bridge to form
biliverdin and carbon monoxide. Biliverdin is subsequently converted to bilirubin by
biliverdin reductase. In mammals there are three isozymes of heme oxygenase: HO-1 to
HO-3. The first two isozymes differ in their tissue expression and their inducibility: HO-1 is
highly inducible by its substrate heme and by various non-heme substances, while HO-2 is
non-inducible. It has been suggested [2] that HO-2 could be implicated in the production of
carbon monoxide in the brain where it is said to act as a neurotransmitter. In the genome of
the chloroplast of red algae as well as in cyanobacteria, there is a heme oxygenase (gene
pbsA) that is the key enzyme in the synthesis of the chromophoric part of the photosynthetic
antennae [3]. A heme oxygenase is also present in the bacterium Corynebacterium diphtheriae
(gene hmuO), where it is involved in the acquisition of iron from the host heme [4]. There is,
in the central section of these enzymes, a well conserved region centered on a histidine
residue which is proposed to play a key role in binding the substrate heme at the active center
of the enzyme. This region was used as a signature pattern.



Consensus pattern: L-[IV]-A-H-[STACH]-Y-[STV]-[RT]-Y-[LIVM]-G [H binds the heme] - 


[ 1] Maines M.D. FASEB J. 2:2557-2568(1988). 
[ 2] Barinaga M. Science 259:309-309(1993). 

[ 3] Richaud C., Zabulon G. Proc. Natl. Acad. Sci. U.S.A. 94:11736-11741(1997).
[ 4] Schmitt M.P. J. Bacteriol. 179:838-845(1997). 



286. Hepatitis core antigen.

The core antigen of hepatitis viruses possesses a carboxyl
terminus rich in arginine. On this basis it was predicted 
that the core antigen would bind DNA [1]. There is some 
experimental evidence to support this [2]. 

[1] Pasek M, Goto T, Gilbert W, Zink B, Schaller H, Mckay P,
Leadbetter G, Murray K; Nature 1979;282:575-579.
[2] Gallina A, Bonelli F, Zentilin L, Rindi G, Muttini M,
Milanesi G; J Virol 1989;63:4645-4652.



287. Histidine biosynthesis protein 

Proteins involved in steps 4 and 6 of the histidine biosynthesis pathway are contained
in this family. Histidine is formed by several complex and distinct biochemical reactions
catalysed by eight enzymes. The enzymes in this Pfam entry are called His6 and His7 in
eukaryotes and HisA and HisF in prokaryotes.

[1] Fani R, Tamburini E, Mori E, Lazcano A, Lio P, Barberio C, Casalone E,
Cavalieri D, Perito B, Polsinelli M; Gene 1997;197:9-17.
[2] Fani R, Lio P, Chiarelli I, Bazzicalupo M; J Mol Evol 1994;38:489-495.



288. Histone deacetylase family 

Histones can be reversibly acetylated on several lysine residues. Regulation of
transcription is caused in part by this mechanism. Histone deacetylases catalyse the removal
of the acetyl group. Histone deacetylases are related to other proteins [1].

[1] Leipe DD, Landsman D; Nucleic Acids Res 1997;25:3693-3697.



289. Histidinol dehydrogenase signature 
Histidinol dehydrogenase (EC 1.1.1.23) (HDH) catalyzes the terminal step in the biosynthesis
of histidine in bacteria, fungi, and plants, the four-electron oxidation of L-histidinol to
histidine. In bacteria HDH is a single chain polypeptide; in fungi it is the C-terminal domain
of a multifunctional enzyme which catalyzes three different steps of histidine biosynthesis;
and in plants it is expressed as nuclear encoded protein precursor which is exported to the
chloroplast [1]. As a signature pattern a highly conserved region located in the central part of
HDH was selected. This region does not correspond to the part of the enzyme that, in most,
but not all HDH sequences contains a cysteine residue which, in Salmonella typhimurium,
has been said [2] to be important for the catalytic activity of the enzyme.

Consensus pattern: I-D-x(2)-A-G-P-[ST]-E-[LIVS]-[LIVMA](3)-[AC]-x(3)-A-x(4)-[LIVM]-
[AV]-[SACL]-[DE]-[LIVMFC]-[LIVM]-[SA]-x(2)-E-H-

[ 1] Nagai A., Ward E., Beck J., Tada S., Chang J.-Y., Scheidegger A., Ryals J. Proc. Natl.
Acad. Sci. U.S.A. 88:4133-4137(1991). 

[ 2] Grubmeyer C.T., Gray W.R. Biochemistry 25:4778-4784(1986). 

290. Homoserine dehydrogenase signature

Homoserine dehydrogenase (EC 1.1.1.3) (HDh) [1,2] catalyzes the NAD-dependent reduction of
aspartate beta-semialdehyde into homoserine. This reaction is the third step in a pathway
leading from aspartate to homoserine. The latter participates in the biosynthesis of threonine
and then isoleucine as well as in that of methionine. HDh is found either as a single chain
protein as in some bacteria and yeast, or as a bifunctional enzyme consisting of an N-terminal
aspartokinase domain and a C-terminal HDh domain as in bacteria such as Escherichia coli
and in plants. As a signature pattern, the best conserved region of HDh has been selected. This
is a segment of 23 to 24 residues located in the central section and that contains two
conserved aspartate residues.


Consensus pattern: A-x(3)-G-[LIVMFY]-[STAG]-x(2,3)-[DNS]-P-x(2)-D-[LIVM]-x-G-x-
D-x(3)-K-
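
The statement that this signature covers a segment of 23 to 24 residues follows directly from the pattern: every element counts as one residue unless it carries a repeat suffix, and the single variable element x(2,3) accounts for the one-residue spread. A minimal Python sketch of that arithmetic is given below; the function name is hypothetical.

```python
import re

def pattern_length_bounds(pattern: str):
    """Return (minimum, maximum) number of residues a PROSITE-style pattern can span."""
    shortest = longest = 0
    for element in pattern.split("-"):
        element = element.strip().rstrip(".")
        if not element:
            continue
        # A trailing (n) or (n,m) gives the repeat count; otherwise the element is one residue.
        repeat = re.search(r"\((\d+)(?:,(\d+))?\)$", element)
        if repeat:
            low = int(repeat.group(1))
            high = int(repeat.group(2) or repeat.group(1))
        else:
            low = high = 1
        shortest += low
        longest += high
    return shortest, longest

HDH_PATTERN = "A-x(3)-G-[LIVMFY]-[STAG]-x(2,3)-[DNS]-P-x(2)-D-[LIVM]-x-G-x-D-x(3)-K"
print(pattern_length_bounds(HDH_PATTERN))   # (23, 24)
```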

[ 1] Thomas D., Barbey R., Surdin-Kerjan Y. FEBS Lett. 323:289-293(1993). 

3 0 [2] Cami B., Clepet C, Patte J.-C. Biochimie 75:487-495(1993). 



291. haloacid dehalogenase-like hydrolase

This family is structurally different from the alpha/beta hydrolase family
(abhydrolase). This family includes L-2-haloacid dehalogenase, epoxide hydrolases and
phosphatases. The structure of the family consists of two domains. One is an inserted four
helix bundle, which is the least well conserved region of the alignment, between residues 16
and 96 of Swiss:P24069. The rest of the fold is composed of the core alpha/beta domain.
[1] Hisano T, Hata Y, Fujii T, Liu JQ, Kurihara T, Esaki N, Soda K, J Biol Chem 1996; 
271:20322-20330. 

292. DEAD and DEAH box families ATP-dependent helicases signatures (helicase_C) 
A number of eukaryotic and prokaryotic proteins have been characterized [1,2,3] on the basis
of their structural similarity. They all seem to be involved in ATP-dependent, nucleic-acid
unwinding. Proteins currently known to belong to this family are:

- Initiation factor eIF-4A. Found in eukaryotes, this protein is a subunit of a high molecular
weight complex involved in 5' cap recognition and the binding of mRNA to ribosomes. It is
an ATP-dependent RNA helicase.
- PRP5 and PRP28. These yeast proteins are involved in various ATP-requiring steps of the
pre-mRNA splicing process.
- PL10, a mouse protein expressed specifically during spermatogenesis.
- An3, a Xenopus putative RNA helicase, closely related to PL10.
- SPP81/DED1 and DBP1, two yeast proteins probably involved in pre-mRNA splicing and
related to PL10.
- Caenorhabditis elegans helicase glh-1.
- MSS116, a yeast protein required for mitochondrial splicing.
- SPB4, a yeast protein involved in the maturation of 25S ribosomal RNA.
- p68, a human nuclear antigen. p68 has ATPase and DNA-helicase activities in vitro. It is
involved in cell growth and division.
- Rm62 (p62), a Drosophila putative RNA helicase related to p68.
- DBP2, a yeast protein related to p68.
- DHH1, a yeast protein.
- DRS1, a yeast protein involved in ribosome assembly.
- MAK5, a yeast protein involved in maintenance of dsRNA killer plasmid.
- ROK1, a yeast protein.
- ste13, a fission yeast protein.
- Vasa, a Drosophila protein important for oocyte formation and specification of embryonic
posterior structures.
- Me31B, a Drosophila maternally expressed protein of unknown function.
- dbpA, an Escherichia coli putative RNA helicase.
- deaD, an Escherichia coli putative RNA helicase which can suppress a mutation in the rpsB
gene for ribosomal protein S2.
- rhlB, an Escherichia coli putative RNA helicase.
- rhlE, an Escherichia coli putative RNA helicase.
- srmB, an Escherichia coli protein that shows RNA-dependent ATPase activity. It probably
interacts with 23S ribosomal RNA.
- Caenorhabditis elegans hypothetical proteins T26G10.1, ZK512.2 and ZK686.2.
- Yeast hypothetical protein YHR065C.
- Yeast hypothetical protein YHR169w.
- Fission yeast hypothetical protein SpAC31A2.07c.
- Bacillus subtilis hypothetical protein yxiN.

All these proteins share a number of conserved sequence motifs. Some of them are specific to
this family while others are shared by other ATP-binding proteins or by proteins belonging to
the helicases 'superfamily' [4,E1]. One of these motifs, called the 'D-E-A-D-box', represents a
special version of the B motif of ATP-binding proteins. Some other proteins belong to a
subfamily whose members have His instead of the second Asp and are thus said to be
'D-E-A-H-box' proteins [3,5,6,E1]. Proteins currently known to belong to this subfamily are:

- PRP2, PRP16, PRP22 and PRP43. These yeast proteins are all involved in various
ATP-requiring steps of the pre-mRNA splicing process.
- Fission yeast prh1, which may be involved in pre-mRNA splicing.
- Male-less (mle), a Drosophila protein required in males for dosage compensation of X
chromosome linked genes.
- RAD3 from yeast. RAD3 is a DNA helicase involved in excision repair of DNA damaged by
UV light, bulky adducts or cross-linking agents. Fission yeast rad15 (rhp3) and mammalian
DNA excision repair protein XPD (ERCC-2) are the homologs of RAD3.
- Yeast CHL1 (or CTF1), which is important for chromosome transmission and normal cell
cycle progression in G(2)/M.
- Yeast TPS1.
- Yeast hypothetical protein YKL078W.
- Caenorhabditis elegans hypothetical proteins C06E1.10 and K03H1.2.
- Poxviruses' early transcription factor 70 Kd subunit, which acts with RNA polymerase to
initiate transcription from early gene promoters.
- I8, a putative vaccinia virus helicase.
- hrpA, an Escherichia coli putative RNA helicase.

Signature patterns were developed for both subfamilies.

Consensus pattern: [LIVMF](2)-D-E-A-D-[RKEN]-x-[LIVMFYGSTN]-

Consensus pattern: [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-[NECR]-

Note: proteins belonging to this family also contain a copy of the ATP/GTP-binding motif
'A' (P-loop) (see the relevant entry <PDOC00017>).

[1] Schmid S.R., Linder P. Mol. Microbiol. 6:283-292(1992).

[2] Linder P., Lasko P., Ashburner M., Leroy P., Nielsen P.J., Nishi K., Schnier J., Slonimski
P.P. Nature 337:121-122(1989).

[3] Wassarman D.A., Steitz J.A. Nature 349:463-464(1991).

[4] Hodgman T.C. Nature 333:22-23(1988) and Nature 333:578-578(1988) (Errata).

[5] Harosh I., Deschavanne P. Nucleic Acids Res. 19:6331-6331(1991).

[6] Koonin E.V., Senkevich T.G. J. Gen. Virol. 73:989-993(1992).



293. Heme-binding domain in cytochrome b5 and oxidoreductases (heme_1)

Cytochrome b5 is a membrane-bound hemoprotein which acts as an electron carrier
for several membrane-bound oxygenases [1]. There are two homologous forms of b5, one
found in microsomes and one found in the outer membrane of mitochondria. Two conserved
histidine residues serve as axial ligands for the heme group. The structure of a number of
oxidoreductases consists of the juxtaposition of a heme-binding domain homologous to that
of b5 and either a flavodehydrogenase or a molybdopterin domain. These enzymes are:

- Lactate dehydrogenase (EC 1.1.2.3) [2], an enzyme that consists of a
flavodehydrogenase domain and a heme-binding domain called cytochrome b2.

- Nitrate reductase (EC 1.6.6.1), a key enzyme involved in the first step of nitrate
assimilation in plants, fungi and bacteria [3,4]. It consists of a molybdopterin
domain (see <PDOC00484>), a heme-binding domain called cytochrome b557, as
well as a cytochrome reductase domain.

- Sulfite oxidase (EC 1.8.3.1) [5], which catalyzes the terminal reaction in the
oxidative degradation of sulfur-containing amino acids. It also consists of a
molybdopterin domain and a heme-binding domain.

This family of proteins also includes:

- TU-36B, a Drosophila muscle protein of unknown function [6].

- Fission yeast hypothetical protein SpAC1F12.10c.

- Yeast hypothetical protein YMR073c.

- Yeast hypothetical protein YMR272c.

A segment which includes the first of the two histidine heme ligands was used as a
signature pattern for the heme-binding domain of the cytochrome b5 family.

Consensus pattern: [FY]-[LIVMK]-x(2)-H-P-[GA]-G [H is a heme axial ligand]- 

[1] Ozols J. Biochim. Biophys. Acta 997:121-130(1989). 
[2] Guiard B. EMBO J. 4:3265-3272(1985).
[3] Calza R., Huttner E., Vincentz M., Rouze P., Galangau F., Vaucheret H., Cherel I., Meyer
C., Kronenberger J., Caboche M. Mol. Gen. Genet. 209:552-562(1987).
[4] Crawford N.M., Smith M., Bellissimo D., Davis R.W. Proc. Natl. Acad. Sci. U.S.A.
85:5006-5010(1988).
[5] Guiard B., Lederer F. Eur. J. Biochem. 100:441-453(1979).
[6] Levin R.J., Boychuk P.L., Croniger C.M., Kazzaz J.A., Rozek C.E. Nucleic Acids Res.
17:6349-6367(1989).

294. Hexapeptide-repeat containing-transferases signature 

On the basis of sequence similarity, a number of transferases have been proposed [1,2,3,4] to
belong to a single family. These proteins are:

- Serine acetyltransferase (EC 2.3.1.30) (SAT) (gene cysE), an enzyme involved in cysteine
biosynthesis.
- Azotobacter chroococcum nitrogen fixation protein nifP. NifP is most probably a SAT
involved in the optimization of nitrogenase activity.
- Escherichia coli thiogalactoside acetyltransferase (EC 2.3.1.18) (gene lacA), an enzyme
involved in the biosynthesis of lactose.
- UDP-N-acetylglucosamine acyltransferase (EC 2.3.1.129) (gene lpxA), an enzyme involved
in the biosynthesis of lipid A, a phosphorylated glycolipid that anchors the
lipopolysaccharide to the outer membrane of the cell.
- UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase (EC 2.3.1.-) (gene lpxD or
firA), which is also involved in the biosynthesis of lipid A.
- Chloramphenicol acetyltransferase (CAT) (EC 2.3.1.28) from Agrobacterium tumefaciens,
Bacillus sphaericus, Escherichia coli plasmid IncFII NR79, Pseudomonas aeruginosa and
Staphylococcus aureus plasmid pIP630. These CAT are not evolutionarily related to the main
family of CAT (see <PDOC00093>).
- Rhizobium nodulation protein nodL. NodL is an acetyltransferase involved in the
O-acetylation of Nod factors.
- Bacterial maltose O-acetyltransferase (EC 2.3.1.79).
- Bacterial tetrahydrodipicolinate N-succinyltransferase (EC 2.3.1.117) (gene dapD), which
catalyzes the fourth step in the biosynthesis of diaminopimelate and lysine from aspartate
semialdehyde.
- Bacterial N-acetylglucosamine-1-phosphate uridyltransferase (EC 2.7.7.23) (gene glmU or
gcaD or tms), an enzyme involved in peptidoglycan and lipopolysaccharide biosynthesis.
- Staphylococcus aureus protein capG, which is involved in biosynthesis of type 1 capsular
polysaccharide.
- Yeast hypothetical protein YJL218w, which is highly similar to Escherichia coli lacA.
- Fission yeast hypothetical protein SpAC18B11.09c.
- Methanococcus jannaschii hypothetical protein MJ1064.

These proteins have been shown [3,4] to contain a repeat structure composed of tandem
repeats of a [LIV]-G-x(4) hexapeptide which, in the tertiary structure of lpxA [5], has been
shown to form a left-handed parallel beta helix. Our signature pattern is based on a fourfold
repeat of this hexapeptide.
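
The tandem arrangement of the [LIV]-G-x(4) hexapeptide can be illustrated by counting consecutive copies in a sequence. The following is a minimal Python sketch and is not part of the original disclosure. The function name and the test sequence are hypothetical, the sequence is synthetic rather than a real transferase, and the sketch uses the strict hexapeptide from the text rather than the more permissive consensus pattern. Because matches are taken left to right without overlap, a run in a different reading frame could be under-counted.

```python
import re

# One or more back-to-back copies of the [LIV]-G-x(4) hexapeptide.
_TANDEM = re.compile(r"(?:[LIV]G.{4})+")


def longest_hexapeptide_run(sequence: str) -> int:
    """Return the largest number of consecutive [LIV]-G-x(4) repeats found."""
    best = 0
    for match in _TANDEM.finditer(sequence):
        best = max(best, (match.end() - match.start()) // 6)
    return best


# Synthetic sequence (not a real protein): four tandem repeats flanked by spacers.
synthetic = "MK" + "IGDNVT" + "LGAHSR" + "VGKNAY" + "IGPRCE" + "WW"
run_length = longest_hexapeptide_run(synthetic)
```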

Consensus pattern: [LIV]-[GAED]-x(2)-[STAV]-x-[LIV]-x(3)-[LIVAC]-x-[LIV]-[GAED]-
x(2)-[STAVR]-x-[LIV]-[GAED]-x(2)-[STAV]-x-[LIV]-x(3)-[LIV]-

[1] Downie J.A. Mol. Microbiol. 3:1649-1651(1989).

[2] Parent R., Roy P.H. J. Bacteriol. 174:2891-2897(1992).

[3] Vaara M. FEMS Microbiol. Lett. 97:249-254(1992).

[4] Vuorio R., Haerkonen T., Tolvanen M., Vaara M. FEBS Lett. 337:289-292(1994).

[5] Raetz C.R.H., Roderick S.L. Science 270:997-1000(1995).

295. Hexokinases signature

Hexokinase (EC 2.7.1.1) [1,2] is an important glycolytic enzyme
that catalyzes the phosphorylation of keto- and aldohexoses (e.g. glucose, mannose and
fructose) using MgATP as the phosphoryl donor. In vertebrates there are four major
isoenzymes, commonly referred to as types I, II, III and IV. Type IV hexokinase, which is often
incorrectly designated glucokinase [3], is only expressed in liver and pancreatic beta-cells
and plays an important role in modulating insulin secretion; it is a protein with a molecular
mass of about 50 Kd. Hexokinases of types I to III, which have low Km values for glucose,
have a molecular mass of about 100 Kd. Structurally they consist of a very small N-terminal
hydrophobic membrane-binding domain followed by two highly similar domains of 450
residues. The first domain has lost its catalytic activity and has evolved into a regulatory
domain. In yeast there are three different isozymes: hexokinase PI (gene HXK1), PII (gene
HXKB), and glucokinase (gene GLK1). All three proteins have a molecular mass of about 50
Kd. All these enzymes contain one (or two in the case of types I to III isozymes) strongly
conserved region which has been shown [4] to be involved in substrate binding. A pattern
from that region has been derived.

Consensus pattern: [LIVM]-G-F-[TN]-F-S-[FY]-P-x(5)-[LIVM]-[DNST]-x(3)-[LIVM]-x(2)-
W-T-K-x-[LF]-

[1] Middleton R.J. Biochem. Soc. Trans. 18:180-183(1990).

[2] Griffin L.D., Gelb B.D., Wheeler D.A., Davison D., Adams V., McCabe E.R. Genomics
11:1014-1024(1991).

[3] Cornish-Bowden A., Luz Cardenas M. Trends Biochem. Sci. 16:281-282(1991).

[4] Schirch D.M., Wilson J.E. Arch. Biochem. Biophys. 254:385-396(1987).

296. Histone H2A signature (his1)

Histone H2A is one of the four histones, along with H2B, H3 and H4, which form the
eukaryotic nucleosome core. Using alignments of histone H2A sequences [1,2,E1], a
conserved region in the N-terminal part of H2A was selected as a signature pattern. This
region is conserved both in classical S-phase regulated H2A's and in variant histone H2A's
which are synthesized throughout the cell cycle.

Consensus pattern: [AC]-G-L-x-F-P-V- 

[1] Wells D.E., Brown D. Nucleic Acids Res. 19:2173-2188(1991).

[2] Thatcher T.H., Gorovsky M.A. Nucleic Acids Res. 22:174-179(1994).

Histone H4 signature (his2) 

Histone H4 is one of the four histones, along with H2A, H2B and H3, which form
the eukaryotic nucleosome core. Along with H3, it plays a central role in nucleosome
formation. The sequence of histone H4 has remained almost invariant in more than 2 billion
years of evolution [1,E1]. The region used as a signature pattern is a pentapeptide found in
positions 14 to 18 of all H4 sequences. It contains a lysine residue which is often acetylated
[2] and a histidine residue which is implicated in DNA-binding [3].

Consensus pattern: G-A-K-R-H- 
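
The stated position of the pentapeptide can be checked with a direct search. The following is a minimal Python sketch and is not part of the original disclosure. The 20-residue string is assumed here to be the N-terminal segment of a typical mature histone H4 (initiator methionine removed); it is supplied for illustration only, and find_motif is a hypothetical helper that reports 1-based coordinates.

```python
import re


def find_motif(sequence: str, regex: str):
    """Return the 1-based (start, end) of the first match, or None if absent."""
    match = re.search(regex, sequence)
    if match is None:
        return None
    return match.start() + 1, match.end()


# Assumed example: first 20 residues of a mature histone H4 (no initiator Met).
h4_n_terminus = "SGRGKGGKGLGKGGAKRHRK"
position = find_motif(h4_n_terminus, "GAKRH")
```

With this input the reported span is positions 14 to 18, in agreement with the text.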

[1] Thatcher T.H., Gorovsky M.A. Nucleic Acids Res. 22:174-179(1994).

[2] Doenecke D., Gallwitz D. Mol. Cell. Biochem. 44:113-128(1982).

[3] Ebralidse K.K., Grachev S.A., Mirzabekov A.D. Nature 331:365-367(1988).

Histone H3 signatures (his3)

Histone H3 is one of the four histones, along with H2A, H2B and H4, which form the
eukaryotic nucleosome core. It is a highly conserved protein of 135 amino acid residues
[1,2,E1]. The following proteins have been found to contain a C-terminal H3-like domain:

- Mammalian centromeric protein CENP-A [3]. It could act as a core histone necessary for the
assembly of centromeres.
- Yeast chromatin-associated protein CSE4 [4].
- Caenorhabditis elegans chromosome III encodes two highly related proteins (F54C8.2 and
F58A4.3) whose C-terminal section is evolutionarily related to the last 100 residues of H3.
The function of these proteins is not yet known.

Two signature patterns were developed. The first one corresponds
to a perfectly conserved heptapeptide in the N-terminal part of H3. The second one is derived
from a conserved region in the central section of H3.

Consensus pattern: K-A-P-R-K-Q-L- 

Consensus pattern: P-F-x-[RA]-L-[VA]-[KRQ]-[DEG]-[IV]- 

[1] Wells D.E., Brown D. Nucleic Acids Res. 19:2173-2188(1991).

[2] Thatcher T.H., Gorovsky M.A. Nucleic Acids Res. 22:174-179(1994).

[3] Sullivan K.F., Hechenberger M., Masri K. J. Cell Biol. 127:581-592(1994).

[4] Stoler S., Keith K.C., Curnick K.E., Fitzgerald-Hayes M. Genes Dev. 9:573-586(1995).

Histone H2B signature (his4)

Histone H2B is one of the four histones, along with H2A, H3 and H4, which form
the eukaryotic nucleosome core. Using alignments of histone H2B sequences [1,2,E1], a
conserved region was selected in the C-terminal part of H2B.

Consensus pattern: [KR]-E-[LIVM]-[EQ]-T-x(2)-[KR]-x-[LIVM](2)-x-[PAG]-[DE]-L-x-
[KR]-H-A-[LIVM]-[STA]-E-G-

[1] Wells D.E., Brown D. Nucleic Acids Res. 19:2173-2188(1991).
[2] Thatcher T.H., Gorovsky M.A. Nucleic Acids Res. 22:174-179(1994).

297. 'Homeobox' domain signature and profile (home1)

The 'homeobox' is a protein domain of 60 amino acids [1 to 5,E1] first identified in a number
of Drosophila homeotic and segmentation proteins. It has since been found to be extremely
well conserved in many other animals, including vertebrates. This domain binds DNA
through a helix-turn-helix type of structure. Some of the proteins which contain a homeobox
domain play an important role in development. Most of these proteins are known to be
sequence-specific DNA-binding transcription factors. The homeobox domain has also been
found to be very similar to a region of the yeast mating type proteins. These are
sequence-specific DNA-binding proteins that act as master switches in yeast differentiation by
controlling gene expression in a cell type-specific fashion. A schematic representation of the
homeobox domain is shown below. The helix-turn-helix region is shown by the symbols 'H'
(for helix), and 't' (for turn).

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxHHHHHHHHtttHHHHHHHHHxxxxxxxxxx
|        |         |         |         |         |         |
1        10        20        30        40        50        60

The pattern to detect homeobox sequences that was developed is 24
residues long and spans positions 34 to 57 of the homeobox domain.
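
The stated span of the pattern (positions 34 to 57, 24 residues) can be checked against the schematic with a short calculation. The following is a minimal Python sketch and is not part of the original disclosure. The run lengths used to rebuild the 60-column schematic (30 x, 8 H, 3 t, 9 H, 10 x) are read off the schematic as printed and should be treated as an assumption, and the variable names are hypothetical.

```python
# Rebuild the 60-column schematic from its run lengths (assumed from the printed figure).
schematic = "x" * 30 + "H" * 8 + "t" * 3 + "H" * 9 + "x" * 10

# The pattern spans positions 34 to 57 (1-based, inclusive).
start, end = 34, 57
span_length = end - start + 1

# Convert the 1-based inclusive range to a Python slice.
covered = schematic[start - 1:end]
```

Under these assumptions the pattern covers the end of the first helix, the turn, all of the second helix and the first residues that follow it.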

Consensus pattern: [LIVMFYG]-[ASLVR]-x(2)-[LIVMSTACN]-x-[LIVM]-x(4)-[LIV]-
[RKNQESTAIY]-[LIVFSTNKH]-W-[FYVC]-x-[NDQTAH]-x(5)-[RKNAIMW]-

[1] Gehring W.J. (In) Guidebook to the homeobox genes, Duboule D., Ed., pp1-10, Oxford
University Press, Oxford, (1994).

[2] Buerglin T.R. (In) Guidebook to the homeobox genes, Duboule D., Ed., pp25-72, Oxford
University Press, Oxford, (1994).

[3] Gehring W.J. Trends Biochem. Sci. 17:277-280(1992).
[4] Gehring W.J., Hiromi Y. Annu. Rev. Genet. 20:147-173(1986).
[5] Schofield P.N. Trends Neurosci. 10:3-6(1987).

'Homeobox' antennapedia-type protein signature (home2) 

The homeotic Hox proteins are sequence-specific transcription factors. They are part of a
developmental regulatory system that provides cells with specific positional identities on the
anterior-posterior (A-P) axis [1]. The Hox proteins contain a 'homeobox' domain. In
Drosophila and other insects, there are eight different Hox genes that are encoded in two gene
complexes, ANT-C and BX-C. In vertebrates there are 38 genes organized in four complexes.
In six of the eight Drosophila Hox genes the homeobox domain is highly similar and a
conserved hexapeptide is found five to sixteen amino acids upstream of the homeobox
domain. The six Drosophila proteins that belong to this group are antennapedia (Antp),
abdominal-A (abd-A), deformed (Dfd), proboscipedia (pb), sex combs reduced (scr) and
ultrabithorax (ubx); they are collectively known as the 'antennapedia' subfamily. In vertebrates
the corresponding Hox genes are known [2] as Hox-A2, A3, A4, A5, A6, A7, Hox-B1, B2,
B3, B4, B5, B6, B7, B8, Hox-C4, C5, C6, C8, Hox-D1, D3, D4 and D8. Caenorhabditis
elegans lin-39 and mab-5 are also members of the 'antennapedia' subfamily. As a signature
pattern for this subfamily of homeobox proteins, the conserved hexapeptide was used.

Consensus pattern: [LIVMFE]-[FY]-P-W-M-[KRQTA]- 

[1] McGinnis W., Krumlauf R. Cell 68:283-302(1992).
[2] Scott M.P. Cell 71:551-553(1992).

'Homeobox' engrailed-type protein signature (home3) 

Most proteins which contain a 'homeobox' domain can be classified [1,2], on the basis
of their sequence characteristics, in three subfamilies: engrailed, antennapedia and paired.
Proteins currently known to belong to the engrailed subfamily are:

- Drosophila segmentation polarity protein engrailed (en), which specifies the body
segmentation pattern and is required for the development of the central nervous system.
- Drosophila invected protein (inv).
- Silk moth proteins engrailed and invected, which may be involved in the
compartmentalization of the silk gland.
- Honeybee E30 and E60.
- Grasshopper (Schistocerca americana) G-En.
- Mammalian and bird En-1 and En-2.
- Zebrafish Eng-1, -2 and -3.
- Sea urchin (Tripneustes gratilla) SU-HB-en.
- Leech (Helobdella triserialis) Ht-En.
- Caenorhabditis elegans ceh-16.

Engrailed homeobox proteins are characterized by the presence of a conserved region of
some 20 amino-acid residues located at the C-terminal of the 'homeobox' domain. As a
signature pattern for this subfamily of proteins, a stretch of eight perfectly conserved residues
in this region was used.

Consensus pattern: L-M-A-[EQ]-G-L-Y-N- 

[1] Scott M.P., Tamkun J.W., Hartzell G.W. III Biochim. Biophys. Acta 989:25-48(1989).
[2] Gehring W.J. Science 236:1245-1252(1987).

298. Isocitrate lyase signature (ICL)

Isocitrate lyase (EC 4.1.3.1 ) [1,2] is an enzyme that catalyzes the conversion of isocitrate to 
succinate and glyoxylate. This is the first step in the glyoxylate bypass, an alternative to the 
tricarboxylic acid cycle in bacteria, fungi and plants. A cysteine, a histidine and a glutamate 
or aspartate have been found to be important for the enzyme's catalytic activity. Only one 
cysteine residue is conserved between the sequences of the fungal, plant and bacterial 
enzymes; it is located in the middle of a conserved hexapeptide that can be used as a 
signature pattern for this type of enzyme. 

Consensus pattern: K-[KR]-C-G-H-[LMQ] [C is a putative active site residue]-

[1] Beeching J.R. Protein Seq. Data Anal. 2:463-466(1989).

[2] Atomi H., Ueda M., Hikida M., Hishida T., Teranishi Y., Tanaka A. J. Biochem.
107:262-266(1990).

299. Initiation factor 2 subunit 

This family includes initiation factor 2B alpha, beta and delta subunits from
eukaryotes, related proteins from archaebacteria and IF-2 from prokaryotes. Initiation factor
2 binds to Met-tRNA, GTP and the small ribosomal subunit.

[1] Kyrpides NC, Woese CR, Proc Natl Acad Sci U S A 1998;95:3726-3730. 

300. Initiation factor 3 signature 

Initiation factor 3 (IF-3) (gene infC) [1] is one of the three factors required for the initiation
of protein biosynthesis in bacteria. IF-3 is thought to function as a fidelity factor during the
assembly of the ternary initiation complex, which consists of the 30S ribosomal subunit, the
initiator tRNA and the messenger RNA. IF-3 binds to the 30S ribosomal subunit; it is a basic
protein of 141 to 212 residues. The chloroplast initiation factor IF-3(chl) is a protein that
enhances the poly(A,U,G)-dependent binding of the initiator tRNA to chloroplast
ribosomal 30S subunits. In its mature form it is a protein of about 400 residues whose central
section is evolutionarily related to the sequence of bacterial IF-3 [2]. As a signature pattern, a
highly conserved region located in the central section of bacterial IF-3 and of
IF-3(chl) was selected.

Consensus pattern: [KR]-[LIVM](2)-[DN]-[FY]-[GSN]-[KR]-[LIVMFYS]-x-[FY]-
[DEQTH]-x(2)-[KRQ]-

[1] Liveris D., Schwartz J.J., Geertman R., Schwartz I. FEMS Microbiol. Lett.
112:211-216(1993).

[2] Lin Q., Ma L., Burkhart W., Spremulli L.L. J. Biol. Chem. 269:9436-9444(1994).

301. Imidazoleglycerol-phosphate dehydratase signatures (IGPD)

Imidazoleglycerol-phosphate dehydratase (EC 4.2.1.19) is the enzyme that catalyzes the
seventh step in the biosynthesis of histidine in bacteria, fungi and plants. In most organisms it
is a monofunctional protein of about 22 to 29 Kd. In some bacteria such as Escherichia coli it
is the C-terminal domain of a bifunctional protein that includes a histidinol-phosphatase
domain [1]. Two signature patterns were developed that each include two consecutive
histidine residues.

Consensus pattern: [LIVMY]-[DE]-x-H-H-x(2)-E-x(2)-[GCA]-[LIVM]-[STAC]-[LIVM]-
Consensus pattern: G-x-[DN]-x-H-H-x(2)-E-[STAGC]-x-[FY]-K-

[1] Carlomagno M.S., Chiariotti L., Alifano P., Nappo A.G., Bruni C.B. J. Mol. Biol.
203:585-606(1988).

302. Indole-3-glycerol phosphate synthase signature (IGPS)

Indole-3-glycerol phosphate synthase (EC 4.1.1.48) (IGPS) catalyzes the fourth step in the
biosynthesis of tryptophan: the ring closure of 1-(2-carboxy-phenylamino)-1-deoxyribulose
into indol-3-glycerol-phosphate. In some bacteria, IGPS is a single chain enzyme. In others,
such as Escherichia coli, it is the N-terminal domain of a bifunctional enzyme that also
catalyzes N-(5'-phosphoribosyl)anthranilate isomerase (PRAI) activity, the third step of
tryptophan biosynthesis. In fungi, IGPS is the central domain of a trifunctional enzyme that
also contains a PRAI C-terminal domain and a glutamine amidotransferase N-terminal
domain. The N-terminal section of IGPS contains a highly conserved region which X-ray
crystallography studies [1] have shown to be part of the active site cavity. This region was
used as a signature pattern for IGPS.

Consensus pattern: [LIVMFY]-[LIVMC]-x-E-[LIVMFYC]-K-[KRSP]-[STAK]-S-P-[ST]-
x(3)-[LIVMFYST]-

[1] Wilmanns M., Priestle J.P., Niermann T., Jansonius J.N. J. Mol. Biol.
223:477-507(1992).

303. (IL2) Interleukin 2. 31 members

304. (ILVD_EDD) Dihydroxy-acid and 6-phosphogluconate dehydratases

Two dehydratases have been shown [1] to be evolutionarily related:

- Dihydroxy-acid dehydratase (EC 4.2.1.9) (gene ilvD or ILV3), which catalyzes the fourth
step in the biosynthesis of isoleucine and valine, the dehydration of
2,3-dihydroxy-isovaleric acid into alpha-ketoisovaleric acid.
- 6-phosphogluconate dehydratase (EC 4.2.1.12) (gene edd), which catalyzes the first step in
the Entner-Doudoroff pathway, the dehydration of 6-phospho-D-gluconate into
6-phospho-2-dehydro-3-deoxy-D-gluconate.
- Escherichia coli hypothetical protein yjhG.

Both enzymes are proteins of about 600 amino acid residues. Two signature patterns were
developed from highly conserved regions. The first pattern is located in the N-terminal part
and contains a cysteine that could be involved in the binding of a 2Fe-2S iron-sulfur cluster [2].
The second pattern is located in the C-terminal half.

Consensus pattern: C-D-K-x(2)-P-[GA]-x(3)-[GA] [The C could be a 2Fe-2S ligand]

Consensus pattern: [SA]-L-[LIVM]-T-D-[GA]-R-[LIVMF]-S-[GA]-[GAV]-[ST]-

[1] Egan S.E., Fliege R., Tong S., Shibata A., Wolf R.E. Jr., Conway T. J. Bacteriol.
174:4638-4646(1992).

[2] Velasco J.A., Cansado J., Pena M.C., Kawakami T., Laborda J.,
Notario V. Gene 137:179-185(1993).

305. IMP dehydrogenase / GMP reductase signature

IMP dehydrogenase (EC 1.1.1.205) (IMPDH) catalyzes the rate-limiting reaction of de novo
GTP biosynthesis, the NAD-dependent oxidation of IMP into XMP [1]. Inhibition of IMP
dehydrogenase activity results in the cessation of DNA synthesis. As IMP dehydrogenase is
associated with cell proliferation, it is a possible target for cancer chemotherapy. Mammalian
and bacterial IMPDHs are tetramers of identical chains. There are two IMP dehydrogenase
isozymes in humans [2]. GMP reductase (EC 1.6.6.8) catalyzes the irreversible and
NADPH-dependent reductive deamination of GMP into IMP [3]. It converts nucleobase,
nucleoside and nucleotide derivatives of G to A nucleotides, and maintains the intracellular
balance of A and G nucleotides. IMP dehydrogenase and GMP reductase share many regions
of sequence similarity. One of these regions is centered on a cysteine residue thought [3] to be
involved in binding IMP. This region was used as a signature pattern.

Consensus pattern: [LIVM]-[RK]-[LIVM]-G-[LIVM]-G-x-G-S-[LIVM]-C-x-T [C is the
putative IMP-binding residue]-

[1] Collart F.R., Huberman E. J. Biol. Chem. 263:15769-15772(1988).

[2] Natsumeda Y., Ohno S., Kawasaki H., Konno Y., Weber G., Suzuki K. J. Biol. Chem.
265:5292-5295(1990).

[3] Andrews S.C., Guest J.R. Biochem. J. 255:35-43(1988).

306. (IPPc) Inositol polyphosphate phosphatase family, catalytic domain

[1] York JD, Ponder JW, Chen ZW, Mathews FS, Majerus PW; Biochemistry
1994;33:13164-13171.
[2] Jefferson AB, Auethavekiat V, Pot DA, Williams LT, Majerus PW; J Biol Chem
1997;272:5983-5988.
[3] Zhang X, Jefferson AB, Auethavekiat V, Majerus PW; Proc Natl Acad Sci U S A
1995;92:4853-4856.
[4] York JD, Majerus PW; Proc Natl Acad Sci U S A 1990;87:9548-9552.
[5] Neuwald AF, York JD, Majerus PW; FEBS Lett 1991;294:16-18.

307. IQ calmodulin-binding motif

[1] Xie X, Harrison DH, Schlichting I, Sweet RM, Kalabokis VN,
Szent-Gyorgyi AG, Cohen C; Nature 1994;368:306-312.
[2] Rhoads AR, Friedberg F; FASEB J 1997;11:331-340.

308. Inosine-uridine preferring nucleoside hydrolase family signature (IU_nuc_hydro)

Inosine-uridine preferring nucleoside hydrolase (EC 3.2.2.1) (IU-nucleoside hydrolase or
IUNH) is an enzyme first identified in protozoa [1] that catalyzes the hydrolysis of all of the
commonly occurring purine and pyrimidine nucleosides into ribose and the associated base,
but has a preference for inosine and uridine as substrates. This enzyme is important for these
parasitic organisms, which are deficient in de novo synthesis of purines, to salvage the host
purine nucleosides. IUNH from Crithidia fasciculata has been sequenced and characterized; it
is a homotetrameric enzyme of subunits of 34 Kd. A histidine has been shown to be
important for the catalytic mechanism; it acts as a proton donor to activate the hypoxanthine
leaving group. IUNH is evolutionarily related to a number of uncharacterized proteins from
various biological sources, notably:

- Escherichia coli hypothetical protein yaaF.
- Escherichia coli hypothetical protein ybeK.
- Escherichia coli hypothetical protein yeiK.
- Fission yeast hypothetical protein SpAC17G8.02.
- Yeast hypothetical protein YDR400w.
- A hypothetical protein from the archaebacterium Desulfurolobus ambivalens.

As a signature pattern for these proteins, a highly conserved region located in the N-terminal
extremity was selected. This region contains four conserved aspartates that have been shown
[2] to be located in the active site cavity.

Consensus pattern: D-x-D-[PT]-[GA]-x-D-D-[TAV]-[VI]-A-

[1] Gopaul D.N., Meyer S.L., Degano M., Sacchettini J.C., Schramm V.L. Biochemistry
35:5963-5970(1996).

[2] Degano M., Gopaul D.N., Scapin G., Schramm V.L., Sacchettini J.C. Biochemistry
35:5971-5981(1996).

309. (Insulinase) 

Insulinase family, zinc-binding region signature 
(aka Peptidase_M16) 

10 

A number of proteases dependent on divalent cations for their activity have been shown [1,2] 
to belong to one family, on the basis of sequence similarity. These enzymes are listed below. 

- Insulinase (EC 3,4.24.56) (also known as insulysin or insulin-degrading enzyme or IDE), a 
1 5 cytoplasmic enzyme which seems to be involved in the cellular processing of insulin, 

glucagon and other small polypeptides. 

- Escherichia coli protease III (EC 3.4.24.55) (pitrilysin) (gene ptr), a periplasmic enzyme 
that degrades small peptides. 

- Mitochondrial processing peptidase (EC 3.4.24.64) (MPP). This enzyme removes the 

2 0 transit peptide from the precursor form of proteins imported from the cytoplasm across the 
mitochondrial inner membrane. It is composed of two nonidentical homologous subunits 
termed alpha and beta. The beta subunit seems to be catalytically active while the alpha 
subunit has probably lost its activity. 

- Nardilysin (EC 3.4.24.61) (N-arginine dibasic convertase or NRD convertase) this 

2 5 mammalian enzyme cleaves peptide substrates on the N-terminus of Arg residues in dibasic 

stretches. 

- Klebsiella pneumoniae protein pqqF. This protein is required for the biosynthesis of the
coenzyme pyrrolo-quinoline-quinone (PQQ). It is thought to be a protease that cleaves peptide
bonds in a small peptide (gene pqqA), thus providing the glutamate and tyrosine residues
necessary for the synthesis of PQQ.

- Yeast protein AXL1, which is involved in axial budding [3].

- Eimeria bovis sporozoite developmental protein.

- Escherichia coli hypothetical protein yddC and HI1368, the corresponding Haemophilus
influenzae protein.

- Bacillus subtilis hypothetical protein ymxG. 

- Caenorhabditis elegans hypothetical proteins C28F5.4 and F56D2.1. 


It should be noted that in addition to the above enzymes, this family also includes the core
proteins I and II of the mitochondrial bc1 complex (also called cytochrome c reductase or
complex III), but the situation as to the activity or lack of activity of these subunits is quite
complex:


- In mammals and yeast, core proteins I and II lack enzymatic activity. 

- In Neurospora crassa and in potato core protein I is equivalent to the beta subunit of MPP. 

- In Euglena gracilis, core protein I seems to be active, while subunit II is inactive. 

These proteins do not share many regions of sequence similarity; the most noticeable is in the
N-terminal section. This region includes a conserved histidine followed, two residues later, by
a glutamate and another histidine. In pitrilysin, it has been shown [4] that this H-x-x-E-H
motif is involved in enzyme activity; the two histidines bind zinc and the glutamate is
necessary for catalytic activity. Inactive members of this family have lost from one to three
of these active site residues. We developed a signature pattern that detects active members of
this family as well as some inactive members.
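
The distinction between active and inactive family members described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be the five-residue window already aligned to the H-x-x-E-H motif, and the example windows are synthetic rather than taken from real sequences.

```python
def active_site_residues_retained(motif: str) -> int:
    """Count how many of the three H-x-x-E-H active site residues are intact.

    `motif` is assumed to be the five-residue window aligned to H-x-x-E-H.
    """
    if len(motif) != 5:
        raise ValueError("expected a five-residue window aligned to H-x-x-E-H")
    # Zinc-binding histidines at offsets 0 and 4, catalytic glutamate at offset 3.
    expected = {0: "H", 3: "E", 4: "H"}
    return sum(1 for offset, residue in expected.items() if motif[offset] == residue)

def is_potentially_active(motif: str) -> bool:
    """A member can only be catalytically active if all three residues are retained."""
    return active_site_residues_retained(motif) == 3

# Synthetic windows (assumptions): one intact motif and two degenerate ones.
for window in ("HFLEH", "HFLDR", "YALQR"):
    print(window, active_site_residues_retained(window), is_potentially_active(window))
```

A member that has lost one to three of these residues is reported as not potentially active, in line with the description of the inactive subunits.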

Consensus pattern: G-x(8,9)-G-x-[STA]-H-[LIVMFY]-[LIVMC]-[DERN]-[HRICL]-
[LMFAT]-x-[LFSTH]-x-[GSTAN]-[GST] [The two H are zinc ligands] [E is the active site
residue] Sequences known to belong to this class detected by the pattern: ALL active
members as well as all MPP alpha subunits and core II subunits. Does not detect inactive core
I subunits.

Note: these proteins belong to family M16 in the classification of peptidases [5]. 

[ 1] Rawlings N.D., Barrett A.J. Biochem. J. 275:389-391(1991). 

[ 2] Braun H.-P., Schmitz U.K. Trends Biochem. Sci. 20:171-175(1995).
[ 3] Fujita A., Oka C., Arikawa Y., Katagai T., Tonouchi A., Kuhara S., Misumi Y. Nature
372:567-570(1994).
[ 4] Becker A.B., Roth R.A. Proc. Natl. Acad. Sci. U.S.A. 89:3835-3839(1992).
[ 5] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:183-228(1995).

310. Involucrin repeat 

Eckert RL, Yaffe MB, Crish JF, Murthy S, Rorke EA, Welter JF, J Invest Dermatol
1993;100:613-617.

311. Isochorismatase family. Members of this family are hydrolase enzymes.

Romao MJ, Turk D, Gomis-Ruth FX, Huber R, Schumacher G, Mollering H, Russmann L, J
Mol Biol 1992;226:1111-1130.

312. Inositol monophosphatase family signatures (inositol_P) 

It has been shown [1] that several proteins share two sequence motifs. Two of these proteins
are enzymes of the inositol phosphate second messenger signaling pathway: - Vertebrate and
plant inositol monophosphatase (EC 3.1.3.25). - Vertebrate inositol polyphosphate
1-phosphatase (EC 3.1.3.57). The function of the other proteins is not yet clear: - Bacterial
protein cysQ. CysQ could help to control the pool of PAPS (3'-phosphoadenosine
5'-phosphosulfate), or be useful in sulfite synthesis. - Escherichia coli protein suhB. Mutations
in suhB result in the enhanced synthesis of heat shock sigma factor (htpR). - Neurospora
crassa protein Qa-X. Probably involved in quinate metabolism. - Emericella nidulans protein
qutG. Probably involved in quinate metabolism. - Yeast protein HAL2/MET22 [2] involved
in salt tolerance as well as methionine biosynthesis. - Yeast hypothetical protein
YHR046c. - Caenorhabditis elegans hypothetical protein F13G3.5. - A Rhizobium
leguminosarum hypothetical protein encoded upstream of the pss gene for exopolysaccharide
synthesis. - Methanococcus jannaschii hypothetical protein MJ0109. It is suggested [1] that
these proteins may act by enhancing the synthesis or degradation of phosphorylated
messenger molecules. From the X-ray structure of human inositol monophosphatase [3], it
seems that some of the conserved residues are involved in binding a metal ion and/or the
phosphate group of the substrate.

Consensus pattern: [FWV]-x(0,1)-[LIVM]-D-P-[LIVM]-D-[SG]-[ST]-x(2)-[FY]-x-
[HKRNSTY] [The first D and the T bind a metal ion]

Consensus pattern: [WV]-D-x-[AC]-[GSA]-[GSAPV]-x-[LIVACP]-[LIV]-[LIVAC]-x(3)-
[GH]-[GA]

[ 1] Neuwald A.F., York J.D., Majerus P.W. FEBS Lett. 294:16-18(1991).

[ 2] Glaeser H.-U., Thomas D., Gaxiola R., Montrichard F., Surdin-Kerjan Y., Serrano R. 
EMBO J. 12:3105-3110(1993). 

[ 3] Bone R., Springer J.P., Atack J.R. Proc. Natl. Acad. Sci. U.S.A. 89:10031-10035(1992). 


313. Ion transport protein 

This family contains sodium, potassium and calcium ion channels. Members of this family have 6
transmembrane helices in which the last two helices flank a loop which determines ion
selectivity. In some sub-families (e.g. Na channels) the domain is repeated four times,
whereas in others (e.g. K channels) the protein forms a tetramer in the membrane. A
bacterial structure of the protein is known for the last two helices, but it is not in the Pfam
family because it lacks the first four helices.

314. Isocitrate and isopropylmalate dehydrogenases signature (isodh)

Isocitrate dehydrogenase (IDH) [1,2] is an important enzyme of carbohydrate metabolism
which catalyzes the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is
either dependent on NAD+ (EC 1.1.1.41) or on NADP+ (EC 1.1.1.42). In eukaryotes there are
at least three isozymes of IDH: two are located in the mitochondrial matrix (one
NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is
cytoplasmic. In Escherichia coli the activity of a NADP+-dependent form of the enzyme is
controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is
completely inactivated. 3-isopropylmalate dehydrogenase (EC 1.1.1.85) (IMDH) [3,4]
catalyzes the third step in the biosynthesis of leucine in bacteria and fungi, the oxidative
decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate. Tartrate dehydrogenase
(EC 1.1.1.93) [5] catalyzes the oxidation of tartrate to oxaloglycolate. These enzymes are
evolutionarily related [1,3,4,5]. The best conserved region of these enzymes is a glycine-rich
stretch of residues located in the C-terminal section. This region was used as a signature
pattern.



Consensus pattern: [NS]-[LIMYT]-[FYDN]-G-[DNT]-[IMVY]-x-[STGDN]-[DN]-x(2)- 
[SGAP]-x(3,4)-G-[STG]-[LIVMPA]-G-[LIVMF]- 


[ 1] Hurley J.H., Thorsness P.E., Ramalingam V., Helmers N.H., Koshland D.E. Jr., Stroud 
R.M. Proc. Natl. Acad. Sci. U.S.A. 86:8635-8639(1989). 
[ 2] Cupp J.R., McAlister-Henn L. J. Biol. Chem. 266:22199-22205(1991). 
[ 3] Imada K., Sato M., Tanaka N., Katsube Y., Matsuura Y., Oshima T. J. Mol. Biol.
222:725-738(1991).

[ 4] Zhang T., Koshland D.E. Jr. Protein Sci. 4:84-92(1995). 

[ 5] Tipton P.A., Beecher B.S. Arch. Biochem. Biophys. 313:15-21(1994). 



315. Jacalin-like lectin domain.

Proteins containing this domain are lectins. It is found in 1 to 6 copies in these proteins.
The domain is also found in the animal prostatic spermine-binding protein (Swiss:P15501).

[1] Sankaranarayanan R, Sekar K, Banerjee R, Sharma V, Surolia 
A, Vijayan M; Nat Struct Biol 1996;3:596-603. 



316. KH domain

KH motifs probably bind RNA directly. Autoantibodies to Nova, a KH domain
protein, cause paraneoplastic opsoclonus ataxia.
[1] Burd CG, Dreyfuss G, Science 1994;265:615-621.
[2] Musco G, Stier G, Joseph C, Castiglione Morelli MA, Nilges M, Gibson TJ, Pastore A,
Cell 1996;85:237-245.



317. Kelch motif

The kelch motif was initially discovered in Kelch (Swiss:Q04652). In this protein
there are six copies of the motif. It has been shown that Swiss:Q04652 is related to Galactose
Oxidase [1] for which a structure has been solved [2]. The kelch motif forms a beta sheet.
Several of these sheets associate to form a beta propeller structure as found in neur,
[1] Bork P, Doolittle RF, J Mol Biol 1994;236:1277-1282.
[2] Ito N, Phillips SE, Stevens C, Ogel ZB, McPherson MJ, Keen JN, Yadav KD, Knowles PF,
Nature 1991;350:87-90.



318. Soybean trypsin inhibitor (Kunitz) protease inhibitors family signature

The soybean trypsin inhibitor (Kunitz) family [1] is one of the numerous families of
proteinase inhibitors. It comprises plant proteins which have inhibitory activity against serine
proteinases from the trypsin and subtilisin families, thiol proteinases and aspartic proteinases,
as well as some proteins that are probably involved in seed storage. This family is currently
known to group the following proteins: - Trypsin inhibitors A, B, C, KTI1, and KTI2 from
soybean. - Trypsin inhibitor DE3 from coral beans (Erythrina sp.). - Trypsin inhibitor DE5
from sandal bead tree. - Trypsin inhibitors 1A (WTI-1A), 1B (WTI-1B), and 2 (WTI-2) from
goa bean. - Trypsin inhibitor from Acacia confusa. - Trypsin inhibitor from silk tree. -
Chymotrypsin inhibitor 3 (WCI-3) from goa bean. - Cathepsin D inhibitors PDI and NDI
from potato [2], which inhibit both cathepsin D (aspartic proteinase) and trypsin. -
Alpha-amylase/subtilisin inhibitors from barley and wheat. - Albumin-1 (WBA-1) from goa bean
seeds [3]. - Miraculin from Richadella dulcifica [4], a sweet taste protein. - Sporamin from
sweet potato [5], the major tuberous root protein. - Thiol proteinase inhibitor PCPI 8.3 (P340)
from potato tuber [6]. - Wound responsive protein gwin3 from poplar tree [7]. - 21 Kd seed
protein from cocoa [8]. All these proteins contain from 170 to 200 amino acid residues and
one or two intrachain disulfide bonds. The best conserved region is found in their N-terminal
section and is used as a signature pattern.

Consensus pattern: [LIVM]-x-D-x-[EDNTY]-[DG]-[RKHDENQ]-x-[LIVM]-x(5)-Y-x-
[LIVM]

[ 1] Laskowski M., Kato I. Annu. Rev. Biochem. 49:593-626(1980). 
[ 2] Ritonja A., Krizaj I., Mesko P., Kopitar M., Lucovnik P., Strukelj B., Pungercar J., Buttle
D.J., Barrett A.J., Turk V. FEBS Lett. 267:13-15(1990).
[ 3] Kortt A.A., Strike P.M., de Jersey J. Eur. J. Biochem. 181:403-408(1989).
[ 4] Theerasilp S., Hitotsuya H., Nakajo S., Nakaja K., Nakamura Y., Kurihara Y. J. Biol.
Chem. 264:6655-6659(1989).
[ 5] Hattori T., Yoshida N., Nakamura K. Plant Mol. Biol. 13:563-572(1989).

[ 6] Krizaj I., Drobnic-Kosorok M., Brzin J., Jerala R., Turk V. FEBS Lett. 333:15-20(1993). 
[ 7] Bradshaw H.D., Hollick J.B., Parsons T.J., Clarke H.R.G., Gordon M.P. Plant Mol. Biol. 
14:51-59(1989). 

[ 8] Tai H., McHenry L., Fritz P.J., Furtek D.B. Plant Mol. Biol. 16:913-915(1991). 




319. Beta-ketoacyl synthases active site 

Beta-ketoacyl-ACP synthase (KAS) [1] is the enzyme that catalyzes the condensation of
malonyl-ACP with the growing fatty acid chain. It is found as a component of the following
enzymatic systems: - Fatty acid synthetase (FAS), which catalyzes the formation of
long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH. Bacterial and plant
chloroplast FAS are composed of eight separate subunits which correspond to different
enzymatic activities; beta-ketoacyl synthase is one of these polypeptides. Fungal FAS
consists of two multifunctional proteins, FAS1 and FAS2; the beta-ketoacyl synthase domain
is located in the C-terminal section of FAS2. Vertebrate FAS consists of a single
multifunctional chain; the beta-ketoacyl synthase domain is located in the N-terminal section
[2]. - The multifunctional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum
[3]. This is a multifunctional enzyme involved in the biosynthesis of a polyketide antibiotic
and which has a KAS domain in its N-terminal section. - Polyketide antibiotic synthase
enzyme systems. Polyketides are secondary metabolites produced by microorganisms and
plants from simple fatty acids. KAS is one of the components involved in the biosynthesis of
the Streptomyces polyketide antibiotics granaticin [4], tetracenomycin C [5] and
erythromycin. - Emericella nidulans multifunctional protein Wa. Wa is involved in the
biosynthesis of conidial green pigment. Wa is a protein of 216 Kd that contains a KAS domain.
- Rhizobium nodulation protein nodE, which probably acts as a beta-ketoacyl synthase in the
synthesis of the nodulation Nod factor fatty acyl chain. - Yeast mitochondrial protein CEM1.
The condensation reaction is a two-step process: the acyl component of an activated acyl
primer is transferred to a cysteine residue of the enzyme and is then condensed with an
activated malonyl donor with the concomitant release of carbon dioxide. The sequence
around the active site cysteine is well conserved and can be used as a signature pattern.

Consensus pattern: G-x(4)-[LIVMFAP]-x(2)-[AGC]-C-[STA](2)-[STAG]-x(3)-[LIVMF] [C 
is the active site residue] 

[ 1] Kauppinen S., Siggaard-Andersen M., von Wettstein-Knowles P. Carlsberg Res. 
Commun. 53:357-370(1988). 

[ 2] Witkowski A., Rangan V.S., Randhawa Z.I., Amy C.M., Smith S. Eur. J. Biochem.
198:571-579(1991).

[ 3] Beck J., Ripka S., Siegner A., Schiltz E., Schweizer E. Eur. J. Biochem. 192:487- 
498(1990). 

[ 4] Bibb M.J., Biro S., Motamedi H., Collins J.F., Hutchinson C.R. EMBO J. 8:2727- 
2736(1989). 

[ 5] Sherman D.H., Malpartida P., Bibb M.J., Kieser H.M., Bibb M.J., Hopwood D.A. EMBO 
J. 8:2717-2725(1989). 

320. Kinesin motor domain signature and profile 

Kinesin [1,2,3] is a microtubule-associated force-producing protein that may play a role in
organelle transport. Kinesin is an oligomeric complex composed of two heavy chains and two
light chains. The kinesin motor activity is directed toward the microtubule's plus end. The
heavy chain is composed of three structural domains: a large globular N-terminal domain
which is responsible for the motor activity of kinesin (it is known to hydrolyze ATP, to bind
and move on microtubules); a central alpha-helical coiled coil domain that mediates the
heavy chain dimerization; and a small globular C-terminal domain which interacts with other
proteins (such as the kinesin light chains), vesicles and membranous organelles. A number of
proteins have been recently found that contain a domain similar to that of the kinesin 'motor'
domain [1,4,E1]: - Drosophila claret segregational protein (ncd). Ncd is required for normal
chromosomal segregation in meiosis, in females, and in early mitotic divisions of the embryo.
The ncd motor activity is directed toward the microtubule's minus end. - Drosophila
kinesin-like protein (nod). Nod is required for the distributive chromosome segregation of
nonexchange chromosomes during meiosis. - Human CENP-E [4]. CENP-E is a protein that
associates with kinetochores during chromosome congression, relocates to the spindle
midzone at anaphase, and is quantitatively discarded at the end of the cell division. CENP-E
is probably an important motor molecule in chromosome movement and/or spindle
elongation. - Human mitotic kinesin-like protein-1 (MKLP-1), a motor protein whose activity
is directed toward the microtubule's plus end. - Yeast KAR3 protein, which is essential for
yeast nuclear fusion during mating. KAR3 may mediate microtubule sliding during nuclear
fusion and possibly mitosis. - Yeast CIN8 and KIP1 proteins, which are required for the
assembly of the mitotic spindle. Both proteins seem to interact with spindle microtubules to
produce an outwardly directed force acting upon the poles. - Fission yeast cut7 protein, which
is essential for spindle body duplication during mitotic division. - Emericella nidulans bimC,
which plays an important role in nuclear division. - Emericella nidulans klpA. -
Caenorhabditis elegans unc-104, which may be required for the transport of substances
needed for neuronal cell differentiation. - Caenorhabditis elegans osm-3. - Xenopus Eg5,
which may be involved in mitosis. - Arabidopsis thaliana KatA, KatB and KatC. -
Chlamydomonas reinhardtii FLA10/KHP1 and KLP1. Both proteins seem to play a role in
the rotation or twisting of the microtubules of the flagella. - Caenorhabditis elegans
hypothetical protein T09A5.2. The kinesin motor domain is located in the N-terminal part of
most of the above proteins, with the exception of KAR3, klpA, and ncd, where it is located in
the C-terminal section. The kinesin motor domain contains about 330 amino acids. An
ATP-binding motif of type A is found near position 80 to 90; the C-terminal half of the domain is
involved in microtubule-binding. The signature pattern for that domain is derived from a
conserved decapeptide inside the microtubule-binding part.

Consensus pattern: [GSA]-[KRHPSTQVM]-[LIVMF]-x-[LIVMF]-[IVC]-D-L-[AH]-G- 
[SAN]-E 

[ 1] Bloom G.S., Endow S.A. Protein Prof. 2:1109-1171(1995). 

[ 2] Vallee R.B., Shpetner H.S. Annu. Rev. Biochem. 59:909-932(1990).

[ 3] Brady S.T. Trends Cell Biol. 5:159-164(1995).

[ 4] Endow S.A. Trends Biochem. Sci. 16:221-225(1991).[E1] 

321. Ribosomal protein L15 signature

Ribosomal protein L15 is one of the proteins from the large ribosomal subunit. In Escherichia
coli, L15 is known to bind the 23S rRNA. It belongs to a family of ribosomal proteins which,
on the basis of sequence similarities [1], groups: - Eubacterial L15. - Plant chloroplast L15
(nuclear-encoded). - Archaebacterial L15. - Vertebrate L27a. - Tetrahymena thermophila
L29. - Fungi L27a (L29, CRP-1, CYH2). L15 is a protein of 144 to 154 amino-acid residues.
As a signature pattern, a conserved region was selected in the C-terminal section of these
proteins.

Consensus pattern: K-[LIVM](2)-[GASL]-x-[GT]-x-[LIVMA]-x(2,5)-[LIVM]-x-[LIVMF]-
x(3,4)-[LIVMFCA]-[ST]-x(2)-A-x(3)-[LIVM]-x(3)-G

[ 1] Otaka E., Hashimoto T., Mizuta K., Suzuki K. Protein Seq. Data Anal. 5:301-313(1993). 

322. LBP / BPI / CETP family signature

The following mammalian lipid-binding serum glycoproteins belong to the same family
[1,2,3]: - Lipopolysaccharide-binding protein (LBP). LBP binds to the lipid A moiety of
bacterial lipopolysaccharides (LPS), a glycolipid present in the outer membrane of all
Gram-negative bacteria. The LBP/LPS complex seems to interact with the CD14 receptor and may
be responsible for the secretion of alpha-TNF. - Bactericidal permeability-increasing protein
(BPI). Like LBP, BPI binds LPS and has a cytotoxic activity on Gram-negative bacteria. -
Cholesteryl ester transfer protein (CETP). CETP is involved in the transfer of insoluble
cholesteryl esters in reverse cholesterol transport. - Phospholipid transfer protein (PLTP).
May play a key role in extracellular phospholipid transport and modulation of HDL particles.
These proteins are structurally related and share many regions of sequence similarity. As a
signature pattern one of these regions was selected, which is located in the N-terminal section
of these proteins; a region which could be involved in the binding to the lipids [2].

Consensus pattern: [PA]-[GA]-[LIVMC]-x(2)-R-[IV]-[ST]-x(3)-L-x(5)-[EQ]-x(4)-[LIVM]-
[EQK]-x(8)-P

[ 1] Schumann R.R., Leong S.R., Flaggs G.W., Gray P.W., Wright S.D., Mathison J.C.,
Tobias P.S., Ulevitch R.J. Science 249:1429-1431(1990).

[ 2] Gray P.W., Flaggs G., Leong S.R., Gumina R.J., Weiss J., Ooi C.E., Elsbach P. J. Biol. 
Chem. 264:9505-9509(1989). 

[ 3] Day J.R., Albers J.J., Lofton-Day C.E., Gilbert T.L., Ching A.F.T., Grant F.J., O'Hara 
P.J., Marcovina S.M., Adolphson J.L. J. Biol. Chem. 269:9388-9391(1994). 

323. LIM domain signature and profile 

Recently [1,2] a number of proteins have been found to contain a conserved cysteine-rich
domain of about 60 amino-acid residues. These proteins are: - Caenorhabditis elegans mec-3;
a protein required for the differentiation of the set of six touch receptor neurons in this
nematode. - Caenorhabditis elegans lin-11; a protein required for the asymmetric division of
vulval blast cells. - Vertebrate insulin gene enhancer binding protein isl-1. Isl-1 binds to one
of the two cis-acting protein-binding domains of the insulin gene. - Vertebrate homeobox
proteins lim-1, lim-2 (lim-5) and lim-3. - Vertebrate lmx-1, which acts as a transcriptional
activator by binding to the FLAT element; a beta-cell-specific transcriptional enhancer found
in the insulin gene. - Mammalian LH-2, a transcriptional regulatory protein involved in the
control of cell differentiation in developing lymphoid and neural cell types. - Drosophila
protein apterous, required for the normal development of the wing and haltere imaginal discs. -
Vertebrate protein kinases LIMK-1 and LIMK-2. - Mammalian rhombotins. Rhombotin 1
(RBTN1 or TTG-1) and rhombotin-2 (RBTN2 or TTG-2) are proteins of about 160 amino
acids whose genes are disrupted by chromosomal translocations in T-cell leukemia. -
Mammalian and avian cysteine-rich protein (CRP), a 192 amino-acid protein of unknown
function. Seems to interact with zyxin. - Mammalian cysteine-rich intestinal protein (CRIP),
a small protein which seems to have a role in zinc absorption and may function as an
intracellular zinc transport protein. - Vertebrate paxillin, a cytoskeletal focal adhesion protein.
- Mouse testin. Mouse testin should not be confused with rat testin, which is a thiol protease
homolog. - Sunflower pollen specific protein SF3. - Chicken zyxin. Zyxin is a low-abundance
adhesion plaque protein which has been shown to interact with CRP. - Yeast protein LRG1,
which is involved in sporulation [4]. - Yeast rho-type GTPase activating protein
RGA1/DBM1. - Caenorhabditis elegans homeobox protein ceh-14. - Caenorhabditis elegans
homeobox protein unc-97. - Yeast hypothetical protein YKR090w. - Caenorhabditis elegans
hypothetical protein C28H8.6. These proteins generally have two tandem copies of a domain,
called LIM (for Lin-11 Isl-1 Mec-3), in their N-terminal section. Zyxin and paxillin
are exceptions in that they contain respectively three and four LIM domains at their
C-terminal extremity. In apterous, isl-1, LH-2, lin-11, lim-1 to lim-3, lmx-1, ceh-14 and
mec-3 there is a homeobox domain some 50 to 95 amino acids after the LIM domains. In the LIM
domain, there are seven conserved cysteine residues and a histidine. The arrangement
followed by these conserved residues is C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-
x(16,21)-C-x(2,3)-[CHD]. The LIM domain binds two zinc ions [5]. LIM does not bind DNA;
rather it seems to act as an interface for protein-protein interaction. A pattern was developed that
spans the first half of the LIM domain.

Consensus pattern: C-x(2)-C-x(15,21)-[FYWH]-H-x(2)-[CH]-x(2)-C-x(2)-C-x(3)-[LIVMF]
[The 5 C's and the H bind zinc]
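
As an illustrative sketch only, the consensus pattern for the first half of the LIM domain can be written as a regular expression whose capture groups mark the six zinc-binding positions (the four fixed cysteines, the histidine, and the [CH] position). The names `LIM_FIRST_HALF` and `zinc_ligand_positions` are hypothetical, and the test sequence is synthetic filler built to satisfy the pattern, not a real LIM protein.

```python
import re

# C-x(2)-C-x(15,21)-[FYWH]-H-x(2)-[CH]-x(2)-C-x(2)-C-x(3)-[LIVMF]
# Each parenthesised group captures one putative zinc ligand.
LIM_FIRST_HALF = re.compile(
    r"(C).{2}(C).{15,21}[FYWH](H).{2}([CH]).{2}(C).{2}(C).{3}[LIVMF]"
)

def zinc_ligand_positions(sequence: str):
    """Return 1-based positions of the six putative zinc ligands, or None."""
    match = LIM_FIRST_HALF.search(sequence.upper())
    if match is None:
        return None
    return [match.start(group) + 1 for group in range(1, 7)]

# Synthetic sequence (an assumption): alanine filler around the conserved residues.
synthetic = "MS" + "CAAC" + "A" * 15 + "FHAACAACAACAAAL"
print(zinc_ligand_positions(synthetic))
```

Only the first match in the sequence is reported here; a protein with tandem LIM domains would need `finditer` to list every copy.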

[ 1] Freyd G., Kim S.K., Horvitz H.R. Nature 344:876-879(1990). 

[ 2] Baltz R., Evrard J.-L., Domon C., Steinmetz A. Plant Cell 4:1465-1466(1992).

[ 3] Sanchez-Garcia I., Rabbitts T.H. Trends Genet. 10:315-320(1994). 

[ 4] Mueller A., Xu G., Wells R., Hollenberg C.P., Piepersberg W. Nucleic Acids Res.
22:3151-3154(1994).

[ 5] Michelsen J.W., Schmeichel K.L., Beckerle M.C., Winge D.R. Proc. Natl. Acad. Sci. 
U.S.A. 90:4404-4408(1993). 

324. (LRR) Leucine Rich Repeat 

CAUTION: This Pfam may not find all Leucine Rich Repeats in a protein. Leucine Rich 
Repeats are short sequence motifs present in a number of proteins with diverse functions and 
cellular locations. These repeats are usually involved in protein-protein interactions. Each 
Leucine Rich Repeat is composed of a beta-alpha unit. These units form elongated non- 
globular structures. Leucine Rich Repeats are often flanked by cysteine rich domains. 
Number of members: 3017
[1] The leucine-rich repeat: a versatile binding motif. Kobe B, Deisenhofer J; Trends
Biochem Sci 1994;19:415-421.
[2] Crystal structure of porcine ribonuclease inhibitor, a
protein with leucine-rich repeats. Kobe B, Deisenhofer J; Nature 1993;366:751-756.

325. Plant lipid transfer protein family signature (LTP) 

Plant cells contain proteins, called lipid transfer proteins (LTP) [1,2,3], which are able
to facilitate the transfer of phospholipids and other lipids across membranes. These proteins,
whose subcellular location is not yet known, could play a major role in membrane biogenesis
by conveying phospholipids such as waxes or cutin from their site of biosynthesis to
membranes unable to form these lipids. Plant LTPs are proteins of about 9 Kd (90 amino
acids) which contain eight conserved cysteine residues all involved in disulfide bridges, as
shown in the following schematic representation.

xCxxxxCxxxxxxCCxxxxxxxxCxCxxxxxxxxxxxCxxxxxxCxx
[Schematic: disulfide bridge connections and pattern position; connector lines not recoverable]

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

Consensus pattern: [LIVM]-[PA]-x(2)-C-x-[LIVM]-x-[LIVM]-x-[LIVMFY]-x-[LIVM]- 
[ST]-x(3)-[DN]-C-x(2)-[LIVM] [The two C's are involved in disulfide bonds] 

[1] Wirtz K.W.A. Annu. Rev. Biochem. 60:73-99(1991). 
[2] Arondel V., Kader J.C. Experientia 46:579-585(1990). 

[3] Ohlrogge J.B., Browse J., Somerville C.R. Biochim. Biophys. Acta 1082:1-26(1991). 

326. (LAMP) Lysosome-associated membrane glycoproteins signatures 
Lysosome-associated membrane glycoproteins (lamp) [1] are integral membrane proteins, 
specific to lysosomes, and whose exact biological function is not yet clear. Structurally, the 
lamp proteins consist of two internally homologous lysosome-luminal domains separated by 
a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region 
followed by a very short cytoplasmic tail. In each of the duplicated domains, there are two
conserved disulfide bonds. This structure is schematically represented in the figure below.

xCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxxxCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxx
[Schematic: two duplicated domains, each with two disulfide bonds, separated by the hinge
and followed by the transmembrane region and the cytoplasmic tail; connector lines not recoverable]

In mammals, there are two closely related types of lamp: lamp-1 and lamp-2. In chicken,
lamp-1 is known as LEP100. The macrophage protein CD68 (or macrosialin) [2] is a heavily
glycosylated integral membrane protein whose structure consists of a mucin-like domain
followed by a proline-rich hinge; a single lamp-like domain; a transmembrane region and a
short cytoplasmic tail. Two signature patterns for this family of proteins were developed.
The first one is centered on the first conserved cysteine of the duplicated domains. The
second corresponds to a region that includes the extremity of the second domain, the totality
of the transmembrane region and the cytoplasmic tail.

Consensus pattern: [STA]-C-[LIVM]-[LIVMFYW]-A-x-[LIVMFYW]-x(3)-[LIVMFYW]-
x(3)-Y [C is involved in a disulfide bond]

Consensus pattern: C-x(2)-D-x(3,4)-[LIVM](2)-P-[LIVM]-x-[LIVM]-G-x(2)-[LIVM]-x-G-
[LIVM](2)-x-[LIVM](4)-A-[FY]-x-[LIVM]-x(2)-[KR]-[RH]-x(1,2)-[STAG](2)-Y-[EQ] [C
is involved in a disulfide bond]

[ 1] Fukuda M. J. Biol. Chem. 266:21327-21330(1991). 

[ 2] Holness C.L., da Silva R.P., Fawcett J., Gordon S., Simmons D.L. J. Biol. Chem. 
268:9661-9666(1993). 

327. Lipolytic enzymes "G-D-S-L" family, serine active site 

Recently [1], a family of lipolytic enzymes has been characterized. This family
currently consists of the following proteins:

- Aeromonas hydrophila lipase/phosphatidylcholine-sterol acyltransferase. 

- Xenorhabdus luminescens lipase 1. 

- Vibrio mimicus arylesterase.

- Escherichia coli acyl-coA thioesterase I (gene tesA). 

- Vibrio parahaemolyticus thermolabile hemolysin/atypical phospholipase.

- Rabbit phospholipase AdRab-B, an intestinal brush border protein with esterase and
phospholipase A/lysophospholipase activity that could be involved in the uptake of dietary
lipids. AdRab-B contains four repeats of about 320 amino acids.

- Arabidopsis thaliana and Brassica napus anther-specific proline-rich protein APG.

- A Pseudomonas putida hypothetical protein in trpE-trpG intergenic region.

A serine has been identified as part of the active site in the Aeromonas, Vibrio mimicus and
Escherichia coli enzymes. It is located in a conserved sequence motif that can be used as a
signature pattern for these proteins.

Consensus pattern: [LIVMFYAG](4)-G-D-S-[LIVM]-x(1,2)-[TAG]-G
[S is the active site residue]

328. (Lipoprotein 4) Prokaryotic membrane lipoprotein lipid attachment site 
In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which
is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase
recognizes a conserved sequence and cuts upstream of a cysteine residue to which a
glyceride-fatty acid lipid is attached [1]. Some of the proteins known to undergo such processing
currently include (for recent listings see [1,2,3]): - Major outer membrane lipoprotein
(murein-lipoproteins) (gene lpp). - Escherichia coli lipoprotein-28 (gene nlpA). - Escherichia
coli lipoprotein-34 (gene nlpB). - Escherichia coli lipoprotein nlpC. - Escherichia coli
lipoprotein nlpD. - Escherichia coli osmotically inducible lipoprotein B (gene osmB). -
Escherichia coli osmotically inducible lipoprotein E (gene osmE). - Escherichia coli
peptidoglycan-associated lipoprotein (gene pal). - Escherichia coli rare lipoproteins A and B
(genes rplA and rplB). - Escherichia coli copper homeostasis protein cutF (or nlpE). -
Escherichia coli plasmids traT proteins. - Escherichia coli Col plasmids lysis proteins. - A
number of Bacillus beta-lactamases. - Bacillus subtilis periplasmic oligopeptide-binding
protein (gene oppA). - Borrelia burgdorferi outer surface proteins A and B (genes ospA and
ospB). - Borrelia hermsii variable major protein 21 (gene vmp21) and 7 (gene vmp7). -
Chlamydia trachomatis outer membrane protein 3 (gene omp3). - Fibrobacter succinogenes
endoglucanase cel-3. - Haemophilus influenzae proteins Pal and Pep. - Klebsiella pullulanase
(gene pulA). - Klebsiella pullulanase secretion protein pulS. - Mycoplasma hyorhinis protein
p37. - Mycoplasma hyorhinis variant surface antigens A, B, and C (genes vlpABC). -




Neisseria outer membrane protein H.8. - Pseudomonas aeruginosa lipopeptide (gene lppL). -
Pseudomonas solanacearum endoglucanase egl. - Rhodopseudomonas viridis reaction center
cytochrome subunit (gene cytC). - Rickettsia 17 Kd antigen. - Shigella flexneri invasion
plasmid proteins mxiJ and mxiM. - Streptococcus pneumoniae oligopeptide transport protein
A (gene amiA). - Treponema pallidum 34 Kd antigen. - Treponema pallidum membrane
protein A (gene tmpA). - Vibrio harveyi chitobiase (gene chb). - Yersinia virulence plasmid
protein yscJ. - Halocyanin from Natronobacterium pharaonis [4], a membrane-associated
copper-binding protein. This is the first archaebacterial protein known to be modified in such
a fashion. From the precursor sequences of all these proteins, a consensus pattern and a set of
rules to identify this type of post-translational modification were derived.

Consensus pattern: {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C [C is 
the lipid attachment site] Additional rules: 1) The cysteine must be between positions 15 and 
35 of the sequence in consideration. 2) There must be at least one Lys or one Arg in the first 
seven positions of the sequence. 

[ 1] Hayashi S., Wu H.C. J. Bioenerg. Biomembr. 22:451-471(1990). 
[ 2] Klein P., Somorjai R.L., Lau P.C.K. Protein Eng. 2:15-20(1988). 
[ 3] von Heijne G. Protein Eng. 2:531-534(1989). 

[ 4] Mattar S., Scharf B., Kent S.B.H., Rodewald K., Oesterhelt D., Engelhard M. J. Biol. 
Chem. 269:14939-14945(1994). 

329. (Lipoprotein 5) Prokaryotic membrane lipoprotein lipid attachment site. In prokaryotes, 
membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a 
specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognizes a 
conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid 
lipid is attached [1]. Some of the proteins known to undergo such processing currently include 
(for recent listings see [1,2,3]): - Major outer membrane lipoprotein (murein-lipoproteins) 
(gene lpp). - Escherichia coli lipoprotein-28 (gene nlpA). - Escherichia coli lipoprotein-34 
(gene nlpB). - Escherichia coli lipoprotein nlpC. - Escherichia coli lipoprotein nlpD. - 
Escherichia coli osmotically inducible lipoprotein B (gene osmB). - Escherichia coli 
osmotically inducible lipoprotein E (gene osmE). - Escherichia coli peptidoglycan-associated 
lipoprotein (gene pal). - Escherichia coli rare lipoproteins A and B (genes rlpA and rlpB). - 
Escherichia coli copper homeostasis protein cutF (or nlpE). - Escherichia coli plasmids traT 
proteins. - Escherichia coli Col plasmids lysis proteins. - A number of Bacillus beta- 
lactamases. - Bacillus subtilis periplasmic oligopeptide-binding protein (gene oppA). - 
Borrelia burgdorferi outer surface proteins A and B (genes ospA and ospB). - Borrelia 
hermsii variable major protein 21 (gene vmp21) and 7 (gene vmp7). - Chlamydia trachomatis 
outer membrane protein 3 (gene omp3). - Fibrobacter succinogenes endoglucanase cel-3. - 
Haemophilus influenzae proteins Pal and Pcp. - Klebsiella pullulanase (gene pulA). - 
Klebsiella pullulanase secretion protein pulS. - Mycoplasma hyorhinis protein p37. - 
Mycoplasma hyorhinis variant surface antigens A, B, and C (genes vlpABC). - Neisseria 
outer membrane protein H.8. - Pseudomonas aeruginosa lipopeptide (gene lppL). - 
Pseudomonas solanacearum endoglucanase egl. - Rhodopseudomonas viridis reaction center 
cytochrome subunit (gene cytC). - Rickettsia 17 Kd antigen. - Shigella flexneri invasion 
plasmid proteins mxiJ and mxiM. - Streptococcus pneumoniae oligopeptide transport protein 
A (gene amiA). - Treponema pallidum 34 Kd antigen. - Treponema pallidum membrane 
protein A (gene tmpA). - Vibrio harveyi chitobiase (gene chb). - Yersinia virulence plasmid 
protein yscJ. - Halocyanin from Natronobacterium pharaonis [4], a membrane-associated 
copper-binding protein. This is the first archaebacterial protein known to be modified in such 
a fashion. From the precursor sequences of all these proteins, a consensus pattern and a set of 
rules to identify this type of post-translational modification have been developed. 

Consensus pattern: {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C [C is 
the lipid attachment site] Additional rules: 1) The cysteine must be between positions 15 and 
35 of the sequence in consideration. 2) There must be at least one Lys or one Arg in the first 
seven positions of the sequence. 
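
For illustration only, the consensus pattern and the two additional rules above may be checked computationally along the lines of the following minimal Python sketch. The regular expression is a hand translation of the pattern; the function name lipid_attachment_sites and the example sequence are hypothetical and form no part of the signature definition itself.

```python
import re

# Hand translation of
#   {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C
# into Python regex syntax. The capturing group marks the cysteine that
# carries the lipid. The lookahead allows overlapping candidates to be found.
LIPOBOX = re.compile(
    r"(?=[^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS](C))"
)


def lipid_attachment_sites(precursor):
    """Return 1-based positions of cysteines that satisfy the pattern and both rules."""
    seq = precursor.upper()
    # Rule 2: at least one Lys or Arg in the first seven positions.
    if not re.search(r"[KR]", seq[:7]):
        return []
    sites = []
    for match in LIPOBOX.finditer(seq):
        cys_pos = match.start(1) + 1  # convert 0-based index to 1-based position
        # Rule 1: the cysteine must lie between positions 15 and 35.
        if 15 <= cys_pos <= 35:
            sites.append(cys_pos)
    return sites


if __name__ == "__main__":
    # Hypothetical precursor-like sequence used only to exercise the function.
    print(lipid_attachment_sites("MKATKLVLGAVILGSTLLAGCSSNAKIDQ"))  # [21]
```

A sequence that matches the pattern but violates either rule (for example, a cysteine before position 15) yields an empty list in this sketch.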

[ 1] Hayashi S., Wu H.C. J. Bioenerg. Biomembr. 22:451-471(1990). 
[ 2] Klein P., Somorjai R.L., Lau P.C.K. Protein Eng. 2:15-20(1988). 
[ 3] von Heijne G. Protein Eng. 2:531-534(1989). 
[ 4] Mattar S., Scharf B., Kent S.B.H., Rodewald K., Oesterhelt D., Engelhard M. J. Biol. 
Chem. 269:14939-14945(1994). 

330. (Lum binding) Riboflavin synthase alpha chain family Lum-binding site signature 

The following proteins have been shown [1,2] to be structurally and evolutionarily related: - 
Riboflavin synthase alpha chain (RS-alpha) (gene ribC in Escherichia coli, ribB in Bacillus 
subtilis and Photobacterium leiognathi, RIB5 in yeast). This enzyme synthesizes riboflavin 
from two moles of 6,7-dimethyl-8-(1'-D-ribityl)lumazine (Lum), a pteridine derivative. - 
Photobacterium phosphoreum lumazine protein (LumP) (gene luxL). LumP is a protein that 
modulates the color of the bioluminescence emission of bacterial luciferase. In the presence 
of LumP, light emission is shifted to higher energy values (shorter wavelength). LumP binds 
non-covalently to 6,7-dimethyl-8-(1'-D-ribityl)lumazine. - Vibrio fischeri yellow fluorescent 
protein (YFP) (gene luxY). Like LumP, YFP modulates light emission but towards a longer 
wavelength. YFP binds non-covalently to FMN. These proteins seem to have evolved from 
the duplication of a domain of about 100 residues. In its C-terminal section, this domain 
contains a conserved motif [KR]-V-N-[LI]-E which has been proposed to be the binding site 
for Lum. RS-alpha which binds two molecules of Lum has two perfect copies of this motif, 
while LumP which binds one molecule of Lum, has a Glu instead of Lys/Arg in the first 
position of the second copy of the motif. Similarly, YFP, which binds to one molecule of 
FMN, also seems to have a potentially dysfunctional binding site by substitution of Gly for 
Glu in the last position of the first copy of the motif. Our signature pattern includes the Lum- 
binding motif. 

Consensus pattern: [LIVMF]-x(5)-G-[STADNQ]-[KREQIYW]-V-N-[LIVM]-E 
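
The consensus patterns in this section use a common notation: square brackets list permitted residues, braces list forbidden residues, x stands for any residue, and a number or range in parentheses gives a repeat count. As a non-authoritative sketch, assuming only these notational elements need to be supported, such a pattern can be translated into a Python regular expression as follows. The function name prosite_to_regex and the test sequence are hypothetical.

```python
import re

# One pattern element: a residue class, an exclusion class, a single residue
# or "x", optionally followed by a repeat count such as (5) or (0,2).
_ELEMENT = re.compile(
    r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern):
    """Translate a consensus pattern in the notation used here into a regex string."""
    # Drop spaces introduced by line wrapping and any trailing hyphen or period.
    pattern = pattern.replace(" ", "").rstrip(".-")
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        if residue == "x":
            residue = "."                           # any residue
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"    # forbidden residues
        if low is not None:
            residue += "{" + low + ("," + high if high else "") + "}"
        parts.append(residue)
    return "".join(parts)


if __name__ == "__main__":
    lum = prosite_to_regex("[LIVMF]-x(5)-G-[STADNQ]-[KREQIYW]-V-N-[LIVM]-E")
    # Hypothetical test sequence containing one copy of the motif.
    hit = re.search(lum, "AAALAAAAAGSKVNLEAAA")
    print(lum, hit.group() if hit else None)
```

Annotations in square brackets that follow a pattern (for example, a note identifying an active site residue) are not part of the pattern and would need to be removed before translation.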

[ 1] O'Kane D.J., Woodward B., Lee J., Prasher D.C. Proc. Natl. Acad. Sci. U.S.A. 88:1100- 
1104(1991). 

[ 2] O'Kane D.J., Prasher D.C. Mol. Microbiol. 6:443-449(1992). 
331. Lysyl oxidase putative copper-binding region signature 

Lysyl oxidase (LOX) [1] is an extracellular copper-dependent enzyme that catalyzes the 
oxidative deamination of peptidyl lysine residues in precursors of various collagens and 
elastins. The deaminated lysines are then able to form aldehyde cross-links. LOX binds a 
single copper atom which seems to reside within an octahedral coordination complex which 
includes at least three histidine ligands. Four histidine residues are clustered in a central 
region of the enzyme. This region is thought to be involved in copper-binding and is called 
the 'copper-talon' [1]. This region was used as a signature pattern. 

Consensus pattern: W-E-W-H-S-C-H-Q-H-Y-H 

[ 1] Krebs C.J., Krawetz S.A. Biochim. Biophys. Acta 1202:7-12(1993). 

332. Metallo-beta-lactamase superfamily (lactamase_B) 

[1] Neuwald AF, Liu JS, Lipman DJ, Lawrence CE, Nucleic Acids Res 1997;25:1665-1677. 
[2] Carfi A, Pares S, Duee E, Galleni M, Duez C, Frere JM, Dideberg O, EMBO J 
1995;14:4914-4921. 

333. L-lactate dehydrogenase active site (ldh1) 

L-lactate dehydrogenase (EC 1.1.1.27) (LDH) [1] catalyzes the reversible NAD-dependent 
interconversion of pyruvate to L-lactate. In vertebrate muscles and in lactic acid bacteria it 
represents the final step in anaerobic glycolysis. This tetrameric enzyme is present in 
prokaryotic and eukaryotic organisms. In vertebrates there are three isozymes of LDH: the M 
form (LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart 
muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds. In 
birds and crocodilian eye lenses, LDH-B serves as a structural protein and is known as 
epsilon-crystallin [2]. L-2-hydroxyisocaproate dehydrogenase (EC 1.1.1.-) (L-hicDH) [3] 
catalyzes the reversible and stereospecific interconversion between 2-ketocarboxylic acids 
and L-2-hydroxy-carboxylic acids. L-hicDH is evolutionarily related to LDHs. As a signature 
for LDHs, a region was selected that includes a conserved histidine which is essential to the 
catalytic mechanism. 

Consensus pattern: [LIVMA]-G-[EQ]-H-G-[DN]-[ST] [H is the active site residue] - 

[ 1] Abad-Zapatero C., Griffith J.P., Sussman J.L., Rossmann M.G. J. Mol. Biol. 198:445- 
467(1987). 
[ 2] Hendriks W., Mulders J.W.M., Bibby M.A., Slingsby C., Bloemendal H., de Jong W.W. 
Proc. Natl. Acad. Sci. U.S.A. 85:7114-7118(1988). 

[ 3] Lerch H.-P., Frank R., Collins J. Gene 83:263-270(1989). 

Malate dehydrogenase active site signature (ldh2) 

Malate dehydrogenase (EC 1.1.1.37) (MDH) [1,2] catalyzes the interconversion of malate to 
oxaloacetate utilizing the NAD/NADH cofactor system. The enzyme participates in the citric 
acid cycle and exists in all aerobic organisms. While prokaryotic organisms contain a single 
form of MDH, in eukaryotic cells there are two isozymes: one which is located in the 
mitochondrial matrix and the other in the cytoplasm. Fungi and plants also harbor a 
glyoxysomal form which functions in the glyoxylate pathway. In plant chloroplasts there is 
an additional NADP-dependent form of MDH (EC 1.1.1.82) which is essential for both the 
universal C3 photosynthesis (Calvin) cycle and the more specialized C4 cycle. As a signature 
pattern for this enzyme a region was chosen that includes two residues involved in the 
catalytic mechanism [3]: an aspartic acid which is involved in a proton relay mechanism, and 
an arginine which binds the substrate. 

Consensus pattern: [LIVM]-T-[TRKMN]-L-D-x(2)-R-[STA]-x(3)-[LIVMFY] [D and R are 
the active site residues] - 
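
Where a pattern is annotated with functionally important residues, as here for the aspartic acid and arginine, their positions in a matching sequence can be reported by placing those residues in capturing groups. The following Python sketch is illustrative only: the regular expression is a hand translation of the pattern above, and the function name mdh_active_site_residues and the test sequence are hypothetical.

```python
import re

# Hand translation of
#   [LIVM]-T-[TRKMN]-L-D-x(2)-R-[STA]-x(3)-[LIVMFY]
# with the annotated Asp and Arg placed in capturing groups 1 and 2.
MDH_SITE = re.compile(r"[LIVM]T[TRKMN]L(D).{2}(R)[STA].{3}[LIVMFY]")


def mdh_active_site_residues(sequence):
    """Return 1-based positions of the annotated Asp and Arg, or None if no match."""
    match = MDH_SITE.search(sequence.upper())
    if match is None:
        return None
    return {"Asp": match.start(1) + 1, "Arg": match.start(2) + 1}


if __name__ == "__main__":
    # Hypothetical sequence fragment used only to exercise the function.
    print(mdh_active_site_residues("GGVTTLDAARSAAALGG"))  # {'Asp': 7, 'Arg': 10}
```

Only the first match is reported in this sketch; a sequence with several candidate sites would need an iteration over all matches.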

[ 1] McAlister-Henn L. Trends Biochem. Sci. 13:178-181(1988). 

[ 2] Gietl C. Biochim. Biophys. Acta 1100:217-234(1992). 

[ 3] Birktoft J.J., Rhodes G., Banaszak L.J. Biochemistry 28:6065-6081(1989). 

[ 4] Cendrin F., Chroboczek J., Zaccai G., Eisenberg H., Mevarech M. Biochemistry 
32:4308-4313(1993). 

334. Legume lectins signatures 

Leguminous plants synthesize sugar-binding proteins which are called legume lectins [1,2]. 
These lectins are generally found in the seeds. The exact function of legume lectins is not 
known but they may be involved in the attachment of nitrogen-fixing bacteria to legumes and 
in the protection against pathogens. Legume lectins bind calcium and manganese (or other 
transition metals). Legume lectins are synthesized as precursor proteins of about 230 to 260 
amino acid residues. Some legume lectins are proteolytically processed to produce two 
chains: beta (which corresponds to the N-terminal) and alpha (C-terminal). The lectin 
concanavalin A (conA) from jack bean is exceptional in that the two chains are transposed 
and ligated (by formation of a new peptide bond). The N-terminus of mature conA thus 
corresponds to that of the alpha chain and the C-terminus to the beta chain. Two signature 
patterns specific to legume lectins have been developed: the first is located in the C-terminal 
section of the beta chain and contains a conserved aspartic acid residue important for the 
binding of calcium and manganese; the second one is located in the N-terminal of the alpha 
chain. 

Consensus pattern: [LIV]-[STAG]-V-[DEQV]-[FLI]-D-[ST] [D binds manganese and 
calcium] - 

Consensus pattern: [LIV]-x-[EDQ]-[FYWKR]-V-x-[LIVF]-G-[LF]-[ST]- 

[ 1] Sharon N., Lis H. FASEB J. 4:3198-3208(1990). 

[ 2] Lis H., Sharon N. Annu. Rev. Biochem. 55:33-37(1986). 

335. CoA-ligases (ligases-CoA) 

This family includes the CoA ligases Succinyl-CoA synthetase alpha and beta chains, 
malate CoA ligase and ATP-citrate lyase. Some members of the family utilise ATP; others use 
GTP. 

[1] Wolodko WT, Fraser ME, James MN, Bridger WA, J Biol Chem 1994;269:10883-10890. 

336. Linker histone H1 and H5 family 

Linker histone H1 is an essential component of chromatin structure. H1 links 
nucleosomes into higher order structures. Histone H1 is replaced by histone H5 in some cell 
types. 

[1] Ramakrishnan V, Finch JT, Graziano V, Lee PL, Sweet RM, Nature 
1993;362:219-223. 

337. Lipocalin signature (lip1) 

Proteins which transport small hydrophobic molecules such as steroids, bilins, retinoids, and 
lipids share limited regions of sequence homology and a common tertiary structure 
architecture [1 to 5]. This is an eight stranded antiparallel beta-barrel with a repeated +1 
topology enclosing an internal ligand binding site [1,3]. The name 'lipocalin' has been 
proposed [5] for this protein family. Proteins known to belong to this family are listed below 
(references are only provided for recently determined sequences). - Alpha-1-microglobulin 
(protein HC), which seems to bind porphyrin. - Alpha-1-acid glycoprotein (orosomucoid), 
which can bind a remarkable array of natural and synthetic compounds [6]. - Aphrodisin 
which, in hamsters, functions as an aphrodisiac pheromone. - Apolipoprotein D, which 
probably binds heme-related compounds. - Beta-lactoglobulin, a milk protein whose 
physiological function appears to be to bind retinol. - Complement component C8 gamma chain, 
which seems to bind retinol [7]. - Crustacyanin [8], a protein from lobster carapace, which 
binds astaxanthin, a carotenoid. - Epididymal-retinoic acid binding protein (E-RABP) [9] 
involved in sperm maturation. - Insectacyanin, a moth bilin-binding protein, and a related 
butterfly bilin-binding protein (BBP). - Late Lactation protein (LALP), a milk protein from 
tammar wallaby [10]. - Neutrophil gelatinase-associated lipocalin (NGAL) (p25) (SV-40 
induced 24p3 protein) [11]. - Odorant-binding protein (OBP), which binds odorants. - Plasma 
retinol-binding proteins (PRBP). - Human pregnancy-associated endometrial alpha-2 
globulin. - Probasin (PB), a rat prostatic protein. - Prostaglandin D synthase (EC 5.3.99.2) 
(GSH-independent PGD synthetase), a lipocalin with enzymatic activity [12]. - Purpurin, a 
retinal protein which binds retinol and heparin. - Quiescence specific protein p20K from 
chicken (embryo CH21 protein). - Rodent urinary proteins (alpha-2-microglobulin), which 
may bind pheromones. - VNSP 1 and 2, putative pheromone transport proteins from mouse 
vomeronasal organ [13]. - Von Ebner's gland protein (VEGP) [14] (also called tear lipocalin), 
a mammalian protein which may be involved in taste recognition. - A frog olfactory protein, 
which may transport odorants. - A protein found in the cerebrospinal fluid of the toad Bufo 
marinus with a supposed function similar to transthyretin in transport across the blood brain 
barrier [15]. - Lizard's epididymal secretory protein IV (LESP IV), which could transport 
small hydrophobic molecules into the epididymal fluid during sperm maturation [16]. - 
Prokaryotic outer-membrane protein blc [17]. The sequences of most members of the family, 
the core or kernel lipocalins, are characterized by three short conserved stretches of residues 
[3,18]. Others, the outlier lipocalin group, share only one or two of these [3,18]. A signature 
pattern was built around the first, common to all outlier and kernel lipocalins, which occurs 
near the start of the first beta-strand. 

Consensus pattern: [DENG]-x-[DENQGSTARK]-x(0,2)-[DENQARK]-[LIVFY]-{CP}-G- 
{C}-W-[FYWLRH]-x-[LIVMTA] 

Note: it is suggested, on the basis of similarities of structure, function, and sequence, that this 
family forms an overall superfamily, called the calycins, with the avidin/streptavidin 
<PDOC00499> and the cytosolic fatty-acid binding proteins <PDOC00188> families [3,19]. 

[ 1] Cowan S.W., Newcomer M.E., Jones T.A. Proteins 8:44-61(1990). 
[ 2] Igaraishi M., Nagata A., Toh H., Urade H., Hayaishi N. Proc. Natl. Acad. Sci. U.S.A. 
89:5376-5380(1992). 
[ 3] Flower D.R., North A.C.T., Attwood T.K. Protein Sci. 2:753-761(1993). 
[ 4] Godovac-Zimmermann J. Trends Biochem. Sci. 13:64-66(1988). 
[ 5] Pervaiz S., Brew K. FASEB J. 1:209-214(1987). 
[ 6] Kremer J.M.H., Wilting J., Janssen L.H.M. Pharmacol. Rev. 40:1-47(1989). 
[ 7] Haefliger J.-A., Peitsch M.C., Jenne D., Tschopp J. Mol. Immunol. 28:123-131(1991). 
[ 8] Keen J.N., Caceres I., Eliopoulos E.E., Zagalsky P.F., Findlay J.B.C. Eur. J. Biochem. 
197:407-417(1991). 
[ 9] Newcomer M.E. Structure 1:7-18(1993). 
[10] Collet C., Joseph R. Biochim. Biophys. Acta 1167:219-222(1993). 
[11] Kjeldsen L., Johnsen A.H., Sengelov H., Borregaard N. J. Biol. Chem. 268:10425- 
10432(1993). 
[12] Peitsch M.C., Boguski M.S. Trends Biochem. Sci. 16:363-363(1991). 
[13] Miyawaki A., Matsushita Y.R., Ryo Y., Mikoshiba T. EMBO J. 13:5835-5842(1994). 
[14] Kock K., Ahlers C., Schmale H. Eur. J. Biochem. 221:905-916(1994). 
[15] Achen M.G., Harms P.J., Thomas T., Richardson S.J., Wettenhall R.E.H., Schreiber G. 
J. Biol. Chem. 267:23170-23174(1992). 
[16] Morel L., Dufarre J.-P., Depeiges A. J. Biol. Chem. 268:10274-10281(1993). 
[17] Bishop R.E., Penfold S.S., Frost L.S., Holtje J.V., Weiner J.H. J. Biol. Chem. 
270:23097-23103(1995). 
[18] Flower D.R., North A.C.T., Attwood T.K. Biochem. Biophys. Res. Commun. 180:69- 
74(1991). 
[19] Flower D.R. FEBS Lett. 333:99-102(1993). 

Cytosolic fatty-acid binding proteins signature (lip2) 

A number of low molecular weight proteins which bind fatty acids and other organic anions 
are present in the cytosol [1,2]. Most of them are structurally related and have probably 
diverged from a common ancestor. This structure is a ten stranded antiparallel beta-barrel, 
albeit with a wide discontinuity between the fourth and fifth strands, with a repeated +1 
topology enclosing an internal ligand binding site [2,7]. Proteins known to belong to this 
family include: - Six, tissue-specific, types of fatty acid binding proteins (FABPs) found in 
liver, intestine, heart, epidermal, adipocyte, brain/retina. Heart FABP is also known as 
mammary-derived growth inhibitor (MDGI), a protein that reversibly inhibits proliferation of 
mammary carcinoma cells. Epidermal FABP is also known as psoriasis-associated FABP [3]. 
- Insect muscle fatty acid-binding proteins. - Testis lipid binding protein (TLBP). - Cellular 
retinol-binding proteins I and II (CRBP). - Cellular retinoic acid-binding protein (CRABP). - 
Gastrotropin, an ileal protein which stimulates gastric acid and pepsinogen secretion. It seems 
that gastrotropin binds to bile salts and bilirubins. - Fatty acid binding proteins MFB1 and 
MFB2 from the midgut of the insect Manduca sexta [4]. In addition to the above cytosolic 
proteins, this family also includes: - Myelin P2 protein, which may be a lipid transport 
protein in Schwann cells. P2 is associated with the lipid bilayer of myelin. - Schistosoma 
mansoni protein Sm14 [5] which seems to be involved in the transport of fatty acids. - 
Ascaris suum p18, a secreted protein that may play a role in sequestering potentially toxic 
fatty acids and their peroxidation products or that may be involved in the maintenance of the 
impermeable lipid layer of the eggshell. - Hypothetical fatty acid-binding proteins F40F4.2, 
F40F4.3, F40F4.4 and ZK742.5 from Caenorhabditis elegans. As a signature pattern for these 
proteins a segment from the N-terminal extremity was used. 

Consensus pattern: [GSAIVK]-x-[FYW]-x-[LIVMF]-x(4)-[NHG]-[FY]-[DE]-x-[LIVMFY]- 
[LIVM]-x(2)-[LIVMAKR] 

Note: it is suggested, on the basis of similarities of structure, function, and sequence, that this 
family forms an overall superfamily, called the calycins, with the lipocalin <PDOC00187> 
and avidin/streptavidin <PDOC00499> families [6,7]. 

[ 1] Bernier I., Jolles P. Biochimie 69:1127-1152(1987). 
[ 2] Veerkamp J.H., Peeters R.A., Maatman R.G.H.J. Biochim. Biophys. Acta 1081:1- 
24(1991). 
[ 3] Siegenthaler G., Hotz R., Chatellard-Gruaz D., Didierjean L., Hellman U., Saurat J.-H. 
Biochem. J. 302:363-371(1994). 
[ 4] Smith A.F., Tsuchida K., Hanneman E., Suzuki T.C., Wells M.A. J. Biol. Chem. 
267:380-384(1992). 
[ 5] Moser D., Tendler M., Griffiths G., Klinkert M.-Q. J. Biol. Chem. 266:8447-8454(1991). 
[ 6] Flower D.R., North A.C.T., Attwood T.K. Protein Sci. 2:753-761(1993). 
[ 7] Flower D.R. FEBS Lett. 333:99-102(1993). 

338. Lipoxygenases iron-binding region signatures 
Lipoxygenases (EC 1.13.11.-) are a class of iron-containing dioxygenases which catalyze the 
hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common 
in plants where they may be involved in a number of diverse aspects of plant physiology 
including growth and development, pest resistance, and senescence or responses to wounding 
[1]. In mammals a number of lipoxygenase isozymes are involved in the metabolism of 
prostaglandins and leukotrienes [2]. Sequence data is available for the following 
lipoxygenases: - Plant lipoxygenases (EC 1.13.11.12). Plants express a variety of cytosolic 
isozymes as well as what seems [3] to be a chloroplast isozyme. - Mammalian arachidonate 
5-lipoxygenase (EC 1.13.11.34). - Mammalian arachidonate 12-lipoxygenase (EC 
1.13.11.31). - Mammalian erythroid cell-specific 15-lipoxygenase (EC 1.13.11.33). The iron 
atom in lipoxygenases is bound by four ligands, three of which are histidine residues [4]. Six 
histidines are conserved in all lipoxygenase sequences; five of them are found clustered in a 
stretch of 40 amino acids. This region contains two of the three iron ligands; the other 
histidines have been shown [5] to be important for the activity of lipoxygenases. As 
signatures for this family of enzymes two patterns in the region of the histidine cluster were 
selected. The first pattern contains the first three conserved histidines and the second pattern 
includes the fourth and the fifth. 




Consensus pattern: H-[EQ]-x(3)-H-x-[LM]-[NQRC]-[GST]-H-[LIVMSTAC](3)-E [The 
second and third H's bind iron]- 

Consensus pattern: [LIVMA]-H-P-[LIVM]-x-[KRQ]-[LIVMF](2)-x-[AP]-H- 

[ 1] Vick B.A., Zimmerman D.C. (In) Biochemistry of plants: A comprehensive treatise, 
Stumpf P.K., Ed., Vol. 9, pp.53-90, Academic Press, New York, (1987). 
[ 2] Needleman P., Turk J., Jakschik B.A., Morrison A.R., Lefkowith J.B. Annu. Rev. 
Biochem. 55:69-102(1986). 
[ 3] Peng Y.L., Shirano Y., Ohta H., Hibino T., Tanaka K., Shibata D. J. Biol. Chem. 
269:3755-3761(1994). 
[ 4] Boyington J.C., Gaffney B.J., Amzel L.M. Science 260:1482-1486(1993). 
[ 5] Steczko J., Donoho G.P., Clemens J.C., Dixon J.E., Axelrod B. Biochemistry 31:4053- 
4057(1992). 

339. Fumarate lyases signature (lyase_1) 

A number of enzymes, belonging to the lyase class, for which fumarate is a substrate have 
been shown [1,2] to share a short conserved sequence around a methionine which is probably 
involved in the catalytic activity of this type of enzyme. These enzymes are: - Fumarase (EC 
4.2.1.2) (fumarate hydratase), which catalyzes the reversible hydration of fumarate to L- 
malate. There seem to be 2 classes of fumarases: class I are thermolabile dimeric enzymes (as 
for example: Escherichia coli fumA and fumB); class II enzymes are thermostable and 
tetrameric and are found in prokaryotes (as for example: Escherichia coli fumC) as well as in 
eukaryotes. The sequences of the two classes of fumarases are not closely related. - Aspartate 
ammonia-lyase (EC 4.3.1.1) (aspartase), which catalyzes the reversible conversion of 
aspartate to fumarate and ammonia. This reaction is analogous to that catalyzed by fumarase, 
except that ammonia rather than water is involved in the trans-elimination reaction. - 
Arginosuccinase (EC 4.3.2.1) (argininosuccinate lyase), which catalyzes the formation of 
arginine and fumarate from argininosuccinate, the last step in the biosynthesis of arginine. - 
Adenylosuccinase (EC 4.3.2.2) (adenylosuccinate lyase) [3], which catalyzes the eighth step in 
the de novo biosynthesis of purines, the formation of 5'-phosphoribosyl-5-amino-4- 
imidazolecarboxamide and fumarate from 1-(5-phosphoribosyl)-4-(N-succino-carboxamide). 
That enzyme can also catalyze the formation of fumarate and AMP from adenylosuccinate. - 
Pseudomonas putida 3-carboxy-cis,cis-muconate cycloisomerase (EC 5.5.1.2) (3- 
carboxymuconate lactonizing enzyme) (gene pcaB) [4], an enzyme involved in aromatic 
acids catabolism. 

Consensus pattern: G-S-x(2)-M-x(2)-K-x-N- 

[ 1] Woods S.A., Schwartzbach S.D., Guest J.R. Biochim. Biophys. Acta 954:14-26(1988). 
[ 2] Woods S.A., Miles J.S., Guest J.R. FEMS Microbiol. Lett. 51:181-186(1988). 
[ 3] Zalkin H., Dixon J.E. Prog. Nucleic Acid Res. Mol. Biol. 42:259-287(1992). 
[ 4] Williams S.E., Woolridge E.M., Ransom S.C., Landro J.A., Babbitt P.C., Kozarich J.W. 
Biochemistry 31:9768-9776(1992). 

340. MCM family signature and profile 

Proteins shown to be required for the initiation of eukaryotic DNA replication share a highly 
conserved domain of about 210 amino-acid residues [1,2,3]. The latter shows some 
similarities [4] with that of various other families of DNA-dependent ATPases. Eukaryotes 
seem to possess a family of six proteins that contain this domain. They were first identified in 
yeast where most of them have a direct role in the initiation of chromosomal DNA replication 
by interacting directly with autonomously replicating sequences (ARS). They were thus 
called 'minichromosome maintenance proteins' with gene symbols prefixed by MCM. These 
six proteins are: - MCM2, also known as cdc19 (in S.pombe) [E1]. - MCM3, also known as 
DNA polymerase alpha holoenzyme-associated protein P1, RLF beta subunit or ROA. - 
MCM4, also known as CDC54, cdc21 (in S.pombe) or dpa (in Drosophila). - MCM5, also 
known as CDC46 or nda4 (in S.pombe). - MCM6, also known as mis5 (in S.pombe). - 
MCM7, also known as CDC47 or Prolifera (in A.thaliana). This family is also present in 
archaebacteria. In Methanococcus jannaschii there are four members: MJ0363, MJ0961, 
MJ1489 and MJECL13. The presence of a putative ATP-binding domain implies that these 
proteins may be involved in an ATP-consuming step in the initiation of DNA replication in 
eukaryotes. As a signature pattern, a perfectly conserved region was selected that represents a 
special version of the B motif found in ATP-binding proteins. 



Consensus pattern: G-[IVT]-[LVAC](2)-[IVT]-D-[DE]-[FL]-[DNST] 

[ 1] Coxon A., Maundrell K., Kearsey S.E. Nucleic Acids Res. 20:5571-5577(1992). 
[ 2] Hu B., Burkhart R., Schulte D., Musahl C., Knippers R. Nucleic Acids Res. 21:5289- 
5293(1993). 

[ 3] Tye B.-K. Trends Cell Biol. 4:160-166(1994). 

[ 4] Koonin E.V. Nucleic Acids Res. 21:2541-2547(1993). 

341. Macrophage migration inhibitory factor family signature (MIF) 

A protein called macrophage migration inhibitory factor (MIF) [1] seems to exert an 
important role in host inflammatory responses. It plays a pivotal role in the host response to 
endotoxic shock and appears to serve as a pituitary "stress" hormone that regulates systemic 
inflammatory responses. MIF is a secreted protein of 115 residues which is not processed 
from a larger precursor. D-dopachrome tautomerase [2] is a mammalian cytoplasmic enzyme 
involved in melanin biosynthesis and that tautomerizes D-dopachrome with concomitant 
decarboxylation to give 5,6-dihydroxyindole (DHI). It is a protein of 117 residues highly 
related to MIF. It must be noted that MIF binds glutathione and has been said to be related to 
glutathione S-transferases. This assertion has been later disproved [3]. As a signature pattern 
for these proteins, a conserved region located in the central section was selected. 

Consensus pattern: [DE]-P-C-A-x(3)-[LIVM]-x-S-I-G-x-[LIVM]-G- 

[ 1] Bucala R. Immunol. Lett. 43:23-26(1994). 

[ 2] Odh G., Hindemith A., Rosengren A.-M., Rosengren E., Rorsman H. Biochem. Biophys. 
Res. Commun. 197:619-624(1993). 

[ 3] Pearson W.R. Protein Sci. 3:525-527(1994). 

342. MIP family signature 

Recently the sequence of a number of different proteins, that all seem to be transmembrane 
channel proteins, has been found to be highly related [1 to 4]. These proteins are listed below. 
- Mammalian major intrinsic protein (MIP). MIP is the major component of lens fiber gap 
junctions. Gap junctions mediate direct exchange of ions and small molecules from one cell to 
another. - Mammalian aquaporins [5]. These proteins form water-specific channels that 
provide the plasma membranes of red cells and kidney proximal and collecting tubules with 
high permeability to water, thereby permitting water to move in the direction of an osmotic 
gradient. - Soybean nodulin-26, a major component of the peribacteroid membrane induced 
during nodulation in legume roots after Rhizobium infection. - Plant tonoplast intrinsic 
proteins (TIP). There are various isoforms of TIP: alpha (seed), gamma, Rt (root), and Wsi 
(water-stress induced). These proteins may allow the diffusion of water, amino acids and/or 
peptides from the tonoplast interior to the cytoplasm. - Bacterial glycerol facilitator protein 
(gene glpF), which facilitates the movement of glycerol across the cytoplasmic membrane. - 
Salmonella typhimurium propanediol diffusion facilitator (gene pduF). - Yeast FPS1, a 
glycerol uptake/efflux facilitator protein. - Drosophila neurogenic protein 'big brain' (bib). 
This protein may mediate intercellular communication; it may function by allowing the 
transport of certain molecule(s) and thereby sending a signal for an ectodermal cell to 
become an epidermoblast instead of a neuroblast. - Yeast hypothetical protein YFL054c. - A 
hypothetical protein from the pepX region of Lactococcus lactis. The MIP family proteins 
seem to contain six transmembrane segments. Computer analysis shows that these proteins 
probably arose by a tandem, intragenic duplication event from an ancestral protein that 
contained three transmembrane segments. As a signature pattern a well conserved region was 
selected which is located in a probable cytoplasmic loop between the second and third 
transmembrane regions. 

Consensus pattern: [HNQA]-x-N-P-[STA]-[LIVMF]-[ST]-[LIVMF]-[GSTAFY] 

[ 1] Reizer J., Reizer A., Saier M.H. Jr. CRC Crit. Rev. Biochem. 28:235-257(1993). 
[ 2] Baker M.E., Saier M.H. Jr. Cell 60:185-186(1990). 

[ 3] Pao G.M., Wu L.-F., Johnson K.D., Hoefte H., Chrispeels M.J., Sweet G., Sandal N.N., 
Saier M.H. Jr. Mol. Microbiol. 5:33-37(1991). 

[ 4] Wistow G.J., Pisano M.M., Chepelinsky A.B. Trends Biochem. Sci. 16:170-171(1991). 
[ 5] Chrispeels M.J., Agre P. Trends Biochem. Sci. 19:421-425(1994). 

343. Mandelate racemase / muconate lactonizing enzyme family signatures 

Mandelate racemase (EC 5.1.2.2) (MR) and muconate lactonizing enzyme (EC 5.5.1.1) 
(MLE) are two bacterial enzymes involved in aromatic acid catabolism. They catalyze 
mechanistically distinct reactions yet they are related at the level of their primary, quaternary 
(homooctamer) and tertiary structures [1,2]. A number of other proteins also seem to be 
evolutionarily related to these two enzymes. These are: - The various plasmid-encoded 
chloromuconate cycloisomerases (EC 5.5.1.7). - Escherichia coli protein rspA [3]. rspA 
seems to be involved in the degradation of homoserine lactone (HSL) or of one of its 
metabolites. - Escherichia coli hypothetical protein ycjG. - Escherichia coli hypothetical 
protein yidU. - A hypothetical protein from Streptomyces ambofaciens [4]. Two signature 
patterns have been developed for these enzymes; both contain conserved acidic residues. The 
second pattern contains an aspartate and a glutamate which are ligands for either a 
magnesium ion (in MR) or a manganese ion (in MLE). 



Consensus pattern: A-x-[SAGCN]-[SAG]-[LIVM]-[DEQ]-x-A-[LA]-x-[DE]-[LIA]-x-[GA]-
[KRQ]-x(4)-[PSA]-[LIV]-x(2)-L-[LIVMF]-G-

Consensus pattern: [LIVF]-x(2)-D-x-[NH]-x(7)-[ACL]-x(6)-[LIVMF]-x(7)-[LIVM]-E-
[DENQ]-P [D and E bind a divalent metal ion]-
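The consensus patterns in this listing follow the PROSITE notation, in which x stands for any residue, square brackets list the residues allowed at a position, and a number in parentheses gives a repeat count. The following is a minimal sketch of how such a pattern could be translated into a regular expression for scanning a sequence. The function name prosite_to_regex and the test sequence are illustrative assumptions and are not part of the original disclosure; bracketed annotations such as "[D and E bind a divalent metal ion]" must be removed from the pattern before conversion.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex string."""
    # Drop whitespace and any trailing hyphen or period left over from the listing.
    pattern = re.sub(r"\s+", "", pattern).rstrip("-.")
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(
            r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?", element
        )
        if m is None:
            raise ValueError(f"unrecognized pattern element: {element!r}")
        residue, low, high = m.groups()
        if residue == "x":
            residue = "."                            # any residue
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"     # excluded residues
        if low is not None:
            residue += "{" + low + ("," + high if high else "") + "}"
        parts.append(residue)
    return "".join(parts)

# Second mandelate racemase / MLE pattern, with the annotation removed.
MR_PATTERN_2 = "[LIVF]-x(2)-D-x-[NH]-x(7)-[ACL]-x(6)-[LIVMF]-x(7)-[LIVM]-E-[DENQ]-P"
mr_regex = prosite_to_regex(MR_PATTERN_2)

# Synthetic sequence built to satisfy the pattern (hypothetical, not a database entry).
synthetic = "GG" + "L" + "AA" + "D" + "A" + "N" + "A" * 7 + "C" + "A" * 6 \
            + "L" + "A" * 7 + "L" + "E" + "D" + "P"
print(mr_regex)
print(re.search(mr_regex, synthetic) is not None)
```

A match found this way only indicates that the sequence is compatible with the signature; it does not by itself establish membership in the family.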



[ 1] Neidhart D.J., Kenyon G.L., Gerlt J.A., Petsko G.A. Nature 347:692-694(1990). 
[ 2] Petsko G.A., Kenyon G.L., Gerlt J.A., Ringe D., Kozarich J.W. Trends Biochem. Sci. 
18:372-376(1993). 

[ 3] Huisman G.W., Kolter R. Science 265:537-539(1994). 

[ 4] Schneider D., Aigle B., Leblond P., Simonet J.M., Decaris B. J. Gen. Microbiol. 
139:2559-2567(1993). 



344. Merozoite Surface Antigen 2 (MSA-2) family 

Thomas AW, Carr DA, Carter JM, Lyon JA, Mol Biochem Parasitol 1990;43:211-220. 



345. MSP (Major sperm protein) domain. 
Major sperm proteins are involved in sperm motility. These proteins oligomerise to 
form filaments. Partial matches to this domain are also found in other non-MSP proteins. 
These include Swiss:P40075 and Swiss:P34593. 

[1] Bullock TL, Roberts TM, Stewart M, J Mol Biol 1996;263:284-296. [2] King KL, 
Stewart M, Roberts TM, Seavy M, J Cell Sci 1992;101:847-857. 



346. (Matrix) Viral matrix protein. Found in Morbillivirus and paramyxovirus, pneumovirus. 
Number of members: 105 


347. O-methyltransferase (methyltransf) 

This family includes a range of O-methyltransferases. These enzymes utilise S-adenosyl 
methionine. 

[1] Keller NP, Dischinger HC, Bhatnagar D, Cleveland TE, Ullah AH, Appl Environ 
Microbiol 1993;59:479-484. 



348. Magnesium chelatase, subunit ChlI 

Magnesium-chelatase is a three-component enzyme that catalyses the insertion of 
Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of 
(bacterio)chlorophyll. Due to this, it is thought that Mg-chelatase has an important role in 
channeling intermediates into the (bacterio)chlorophyll branch in response to conditions 
suitable for photosynthetic growth. ChlI and BchD have molecular weights between 38 and 42 
kDa. 

[1] Walker CJ, Willows RD, Biochem J 1997;327:321-333. [2] Petersen BL, Jensen 
PE, Gibson LC, Stummann BM, Hunter CN, Henningsen KW, J Bacteriol 1998;180:699-704. 






349. Plasmid recombination enzyme (Mob_Pre) 

With some plasmids, recombination can occur in a site specific manner that is 
independent of RecA. In such cases, the recombination event requires another protein called 
Pre. Pre is a plasmid recombination enzyme. This protein is also known as Mob (conjugative 
mobilization). 

[1] Priebe SD, Lacks SA, J Bacteriol 1989;171:4778-4784. 

350. Monooxygenase 

This family includes diverse enzymes that utilise FAD. 

[1] Gatti DL, Palfey BA, Lah MS, Entsch B, Massey V, Ballou DP, Ludwig ML, 
Science 1994;266:110-114. 

351. Mov34 family 

Members of this family are found in proteasome regulatory subunits, eukaryotic 
initiation factor 3 (eIF3) subunits and regulators of transcription factors. 

[1] Aravind L, Ponting CP, Protein Sci 1998;7:1250-1254. [2] Hershey JW, Asano K, 
Naranda T, Vornlocher HP, Hanachi P, Merrick WC, Biochimie 1996;78:903-907. 

352. Myc amino-terminal region (Myc_N_term) 

The myc family belongs to the basic helix-loop-helix leucine zipper class of 
transcription factors, see HLH. Myc forms a heterodimer with Max, and this complex 
regulates cell growth through direct activation of genes involved in cell replication [2]. 

[1] Facchini LM, Penn LZ, FASEB J 1998;12:633-651. [2] Grandori C, Eisenman 
RN, Trends Biochem Sci 1997;22:177-181. 

353. (Metallothio_2) Metallothionein. Members of this family are metallothioneins. These 
proteins are cysteine rich proteins that bind to heavy metals. Members of this family appear 
to be closest to Class II metallothioneins, see metalthio. Number of members: 55 

[1] Medline: 98267202. Characterization of gene repertoires at mature stage of citrus fruits 
through random sequencing and analysis of redundant metallothionein-like genes expressed 
during fruit development. Moriguchi T, Kita M, Hisada S, Endo-Inagaki T, Omura M; Gene 
1998;211:221-227. 

354. MAGE family 

The MAGE (melanoma antigen-encoding gene) family are expressed 
in a wide variety of tumors but not in normal cells, with the 
exception of the male germ cells, placenta, and, possibly, 
cells of the developing embryo. The cellular function of 
this family is unknown. 

[1] McCurdy DK, Tai LQ, Nguyen J, Wang Z, Yang HM, Udar N, 
Naiem F, Concannon P, Gatti RA; Mol Genet Metab 1998;63:3-13. 

355. Malic enzymes signature. Malic enzymes, or malate oxidoreductases, catalyze the 
oxidative decarboxylation of malate into pyruvate, a reaction important for a wide range of 
metabolic pathways. There are three related forms of malic enzyme [1,2,3]: - NAD-dependent 
malic enzyme (EC 1.1.1.38), which uses preferentially NAD and has the ability to decarboxylate 
oxaloacetate (OAA). It is found in bacteria and insects. - NAD-dependent malic enzyme (EC 
1.1.1.39), which uses preferentially NAD and is unable to decarboxylate OAA. It is found in 
the mitochondrial matrix of plants and is a heterodimer of highly related subunits. - NADP-
dependent malic enzyme (EC 1.1.1.40), which has a preference for NADP and has the ability 
to decarboxylate OAA. This form has been found in fungi, animals and plants. In mammals, 
there are two isozymes: one mitochondrial and the other cytosolic. Plants also have two 
isozymes: chloroplastic and cytosolic. There are two other proteins which are closely 
structurally related to malic enzymes: - Escherichia coli protein sfcA, whose function is not 
yet known but which could be an NAD or NADP-dependent malic enzyme. - Yeast 
hypothetical protein YKL029c, a probable malic enzyme. There are three well conserved 
regions in the enzyme sequences. Two of them seem to be involved in binding NAD or 
NADP. The significance of the third one, located in the central part of the enzymes, is not yet 
known. This region has been developed as a signature pattern for these enzymes. 

Consensus pattern: F-x-[DV]-D-x(2)-G-T-[GSA]-x-[IV]-x-[LIVMA]-[GAST](2)-
[LIVMF](2)- 

[ 1] Artus N.N., Edwards G.E. FEBS Lett. 182:225-233(1985). [ 2] Loeber G., Infante A.A., 
Maurer-Fogy I., Krystek K, Dworkin M.B. J. Biol. Chem. 266:3016-3021(1991). [ 3] Long 
J.J., Wang J.-L., Berry J.O. J. Biol. Chem. 269:2827-2833(1994). 

356. (matrixin) 

Matrixins cysteine switch (aka peptidase_M10) 

Mammalian extracellular matrix metalloproteinases (EC 3.4.24.-), also known as matrixins 
[1] (see <PDOC00129>), are zinc-dependent enzymes. They are secreted by cells in an 
inactive form (zymogen) that differs from the mature enzyme by the presence of an N- 
terminal propeptide. A highly conserved octapeptide is found two residues downstream of the 
C-terminal end of the propeptide. This region has been shown to be involved in 
autoinhibition of matrixins [2,3]; a cysteine within the octapeptide chelates the active site 
zinc ion, thus inhibiting the enzyme. This region has been called the 'cysteine switch' or 
'autoinhibitor region'. 

A cysteine switch has been found in the following zinc proteases: 

- MMP-1 (EC 3.4.24.7) (interstitial collagenase). 

- MMP-2 (EC 3.4.24.24) (72 Kd gelatinase). 

- MMP-3 (EC 3.4.24.17) (stromelysin-1). 

- MMP-7 (EC 3.4.24.23) (matrilysin). 

- MMP-8 (EC 3.4.24.34) (neutrophil collagenase). 

- MMP-9 (EC 3.4.24.35) (92 Kd gelatinase). 

- MMP-10 (EC 3.4.24.22) (stromelysin-2). 

- MMP-11 (EC 3.4.24.-) (stromelysin-3). 

- MMP-12 (EC 3.4.24.65) (macrophage metalloelastase). 

- MMP-13 (EC 3.4.24.-) (collagenase 3). 
- MMP-14 (EC 3.4.24.-) (membrane-type matrix metalloproteinase 1). 

- MMP-15 (EC 3.4.24.-) (membrane-type matrix metalloproteinase 2). 

- MMP-16 (EC 3.4.24.-) (membrane-type matrix metalloproteinase 3). 

- Sea urchin hatching enzyme (EC 3.4.24.12) (envelysin) [4]. 

- Chlamydomonas reinhardtii gamete lytic enzyme (GLE) [5]. 

Consensus pattern: P-R-C-[GN]-x-P-[DR]-[LIVSAPKQ] [C chelates the zinc ion] 

Sequences known to belong to this class detected by the pattern: ALL, except for cat MMP-7 
and mouse MMP-11. 
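Because the cysteine switch pattern has a fixed length and the zinc-chelating cysteine is its third residue, the position of that cysteine can be read directly from a pattern match. The sketch below illustrates this under stated assumptions: the function name switch_cysteine_positions and the example fragment are hypothetical and serve only to show the bookkeeping.

```python
import re

# Regex form of the consensus P-R-C-[GN]-x-P-[DR]-[LIVSAPKQ] given above.
CYSTEINE_SWITCH = re.compile(r"PRC[GN].P[DR][LIVSAPKQ]")

def switch_cysteine_positions(sequence):
    """Return the 1-based position of the chelating Cys for every pattern match."""
    # The Cys is the third residue of the motif: 0-based start + 2, or start + 3 when 1-based.
    return [m.start() + 3 for m in CYSTEINE_SWITCH.finditer(sequence)]

example = "MKAAQPRCGVPDVAEF"  # synthetic fragment, not a real database entry
print(switch_cysteine_positions(example))
```

An empty list means that no cysteine switch was found, as would be expected for the two exceptions noted above.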

[ 1] Woessner J. Jr. FASEB J. 5:2145-2154(1991). 


[ 2] Sanchez-Lopez R., Nicholson R., Gesnel M.C., Matrisian L.M., Breathnach R. J. Biol. 
Chem. 263:11892-11899(1988). 

[ 3] Park A.J., Matrisian L.M., Kells A.F., Pearson R., Yuan Z., Navre M. J. Biol. Chem. 
266:1584-1590(1991). 

[ 4] Lepage T., Gache C. EMBO J. 9:3003-3012(1990). 

[ 5] Kinoshita T., Fukuzawa H., Shimada T., Saito T., Matsuda Y. Proc. Natl. Acad. Sci. 
U.S.A. 89:4693-4697(1992). 

357. Vertebrate metallothioneins signature (metalthio) 

Metallothioneins (MT) [1,2,3] are small proteins which bind heavy metals such as zinc, 
copper, cadmium, nickel, etc., through clusters of thiolate bonds. MT's occur throughout the 
animal kingdom and are also found in higher plants, fungi and some prokaryotes. On the 
basis of structural relationships MT's have been subdivided into three classes. Class I includes 
mammalian MT's as well as MT's from crustaceans and molluscs, but with clearly related 
primary structure. Class II groups together MT's from various species such as sea urchins, 
fungi, insects and cyanobacteria which display none or only very distant correspondence to 
class I MT's. Class III MT's are atypical polypeptides containing gamma-glutamylcysteinyl 
units. Vertebrate class I MT's are proteins of 60 to 68 amino acid residues; 20 of these 
residues are cysteines that bind to 7 bivalent metal ions. As a signature pattern a region that 
spans 19 residues and which contains seven of the metal-binding cysteines was chosen; this 
region is located in the N-terminal section of class-I MT's. 

Consensus pattern: C-x-C-[GSTAP]-x(2)-C-x-C-x(2)-C-x-C-x(2)-C-x-K- 
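The statement that the signature spans 19 residues and contains seven cysteines can be checked arithmetically from the pattern itself. The following short sketch does so; the helper name element_length is an illustrative assumption, and the calculation applies only to patterns with fixed repeat counts such as this one.

```python
import re

PATTERN = "C-x-C-[GSTAP]-x(2)-C-x-C-x(2)-C-x-C-x(2)-C-x-K"

def element_length(element):
    """Number of residues covered by one pattern element, e.g. 2 for x(2)."""
    m = re.search(r"\((\d+)\)$", element)
    return int(m.group(1)) if m else 1

elements = PATTERN.split("-")
span = sum(element_length(e) for e in elements)    # total residues covered
cysteines = sum(1 for e in elements if e == "C")   # invariant Cys positions
print(span, cysteines)
```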

[ 1] Hamer D.H. Annu. Rev. Biochem. 55:913-951(1986). 

[ 2] Kagi J.H.R., Schaffer A. Biochemistry 27:8509-8515(1988). 

[ 3] Binz P.-A. Thesis, 1996, University of Zurich. 

358. Mitochondrial energy transfer proteins signature (mito_carr) 

Different types of substrate carrier proteins involved in energy transfer are found in the inner 
mitochondrial membrane [1 to 5]. These are: - The ADP,ATP carrier protein (AAC) 
(ADP/ATP translocase) which exports ATP into the cytosol and imports ADP into the 
mitochondrial matrix. The sequence of AAC has been obtained from various mammalian, 
plant and fungal species. - The 2-oxoglutarate/malate carrier protein (OGCP), which exports 
2-oxoglutarate into the cytosol and imports malate or other dicarboxylic acids into the 
mitochondrial matrix. This protein plays an important role in several metabolic processes 
such as the malate/aspartate and the oxoglutarate/isocitrate shuttles. - The phosphate carrier 
protein, which transports phosphate groups from the cytosol into the mitochondrial matrix. - 
The brown fat uncoupling protein (UCP) which dissipates oxidative energy into heat by 
transporting protons from the cytosol into the mitochondrial matrix. - The tricarboxylate 
transport protein (or citrate transport protein) which is involved in citrate-H+/malate 
exchange. It is important for the bioenergetics of hepatic cells as it provides a carbon source 
for fatty acid and sterol biosyntheses, and NAD for the glycolytic pathway. - The Grave's 
disease carrier protein (GDC), a protein of unknown function recognized by IgG in patients 
with active Grave's disease. - Yeast mitochondrial proteins MRS3 and MRS4. The exact 
function of these proteins is not known. They suppress a mitochondrial splice defect in the 
first intron of the COB gene and may act as carriers, exerting their suppressor activity by 
modulating solute concentrations in the mitochondrion. - Yeast mitochondrial FAD carrier 
protein (gene FLX1). - Yeast protein ACR1 [6], which seems essential for acetyl-CoA 
synthetase activity. - Yeast protein PET8. - Yeast protein PMT. - Yeast protein RIM2. - Yeast 
protein YHM1/SHM1. - Yeast protein YMC1. - Yeast protein YMC2. - Yeast hypothetical 
proteins YBR291c, YEL006w, YER053c, YFR045w, YHR002w, and YIL006w. - 
Caenorhabditis elegans hypothetical protein K11H3.3. Two other proteins have been found to 
belong to this family, yet are not localized in the mitochondrial inner membrane: - Maize 
amyloplast Brittle-1 protein. This protein, found in the endosperm of kernels, could play a 
role in amyloplast membrane transport. - Candida boidinii peroxisomal membrane protein 
PMP47 [7]. PMP47 is an integral membrane protein of the peroxisome and it may play a role 
as a transporter. These proteins all seem to be evolutionarily related. Structurally, they 
consist of three tandem repeats of a domain of approximately one hundred residues. Each of 
these domains contains two transmembrane regions. As a signature pattern, one of the most 
conserved regions in the repeated domain was selected, located just after the first 
transmembrane region. 

Consensus pattern: P-x-[DE]-x-[LIVAT]-[RK]-x-[LRH]-[LIVMFY]-[QGAIVM]- 

[ 1] Klingenberg M. Trends Biochem. Sci. 15:108-112(1990). 

[ 2] Walker J.E. Curr. Opin. Struct. Biol. 2:519-526(1992). 

[ 3] Kuan J., Saier M.H. Jr. CRC Crit. Rev. Biochem. 28:209-233(1993). 

[ 4] Kuan J., Saier M.H. Jr. Res. Microbiol. 144:671-672(1993). 

[ 5] Nelson D.R., Lawson J.E., Klingenberg M., Douglas M.G. J. Mol. Biol. 230:1159- 
1170(1993). 

[ 6] Palmieri F. FEBS Lett. 346:48-54(1994). 

[ 7] Jank B., Habermann B., Schweyen R.J., Link T.A. Trends Biochem. Sci. 18:427- 
428(1993). 

359. Prokaryotic molybdopterin oxidoreductases signatures (molybdopterin) 
A number of different prokaryotic oxidoreductases that require and bind a molybdopterin 
cofactor have been shown [1,2,3] to share a number of regions of sequence similarity. These 
enzymes are: - Escherichia coli respiratory nitrate reductase (EC 1.7.99.4). This enzyme 
complex allows the bacteria to use nitrate as an electron acceptor during anaerobic growth. 
The enzyme is composed of three different chains: alpha, beta and gamma. The alpha chain 
(gene narG) is the molybdopterin-binding subunit. Escherichia coli encodes a second, 
closely related, nitrate reductase complex which also contains a molybdopterin-binding alpha 
chain (gene narZ). - Escherichia coli anaerobic dimethyl sulfoxide reductase (DMSO 
reductase). DMSO reductase is the terminal reductase during anaerobic growth on various 
sulfoxide and N-oxide compounds. DMSO reductase is composed of three chains: A, B and 
C. The A chain (gene dmsA) binds molybdopterin. - Escherichia coli biotin sulfoxide 
reductases (genes bisC and bisZ). This enzyme reduces a spontaneous oxidation product of 
biotin, BDS, back to biotin. It may serve as a scavenger, allowing the cell to use biotin 
sulfoxide as a biotin source. - Methanobacterium formicicum formate dehydrogenase (EC 
1.2.1.2). The alpha chain (gene fdhA) of this dimeric enzyme binds a molybdopterin cofactor. 
- Escherichia coli formate dehydrogenases -H (gene fdhF), -N (gene fdnG) and -O (gene 
fdoG). These enzymes are responsible for the oxidation of formate to carbon dioxide. In 
addition to molybdopterin, the alpha (catalytic) subunit also contains an active-site 
selenocysteine. - Wolinella succinogenes polysulfide reductase chain. This enzyme is a 
component of the phosphorylative electron transport system with polysulfide as the terminal 
acceptor. It is composed of three chains: A, B and C. The A chain (gene psrA) binds 
molybdopterin. - Salmonella typhimurium thiosulfate reductase (gene phsA). - Escherichia 
coli trimethylamine-N-oxide reductase (EC 1.6.6.9) (gene torA) [4]. - Nitrate reductase (EC 
1.7.99.4) from Klebsiella pneumoniae (gene nasA), Alcaligenes eutrophus, Escherichia coli, 
Rhodobacter sphaeroides, Thiosphaera pantotropha (gene napA), and Synechococcus PCC 
7942 (gene narB). These proteins range from 715 amino acids (fdhF) to 1246 amino acids 
(narZ) in size. Three signature patterns for these enzymes were derived. The first is based on a 
conserved region in the N-terminal section and contains two cysteine residues perhaps 
involved in binding the molybdopterin cofactor. It should be noted that this region is not 
present in bisC. The second pattern is derived from a conserved region located in the central 
part of these enzymes. 

Consensus pattern: [STAN]-x-[CH]-x(2,3)-C-[STAG]-[GSTVMF]-x-C-x-[LIVMFYW]-x-
[LIVMA]-x(3,4)-[DENQKHT]-

Consensus pattern: [STA]-x-[STAC](2)-x(2)-[STA]-D-[LIVMY](2)-L-P-x-[STAC](2)-x(2)-
E-

Consensus pattern: A-x(3)-[GDT]-I-x-[DNQTK]-x-[DEA]-x-[LIVM]-x-[LIVMC]-x-[NS]-
x(2)-[GS]-x(5)-A-x-[LIVM]-[ST]-
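The first of these patterns contains variable-length gaps, x(2,3) and x(3,4), so a match may cover between 18 and 20 residues. The sketch below shows how such gaps translate into bounded regular-expression quantifiers. The regular expression is a hand translation of the first pattern, and the three test sequences are synthetic constructions chosen only to exercise the gap lengths; none of them is taken from the original disclosure.

```python
import re

# Hand translation of
# [STAN]-x-[CH]-x(2,3)-C-[STAG]-[GSTVMF]-x-C-x-[LIVMFYW]-x-[LIVMA]-x(3,4)-[DENQKHT]
MOLYBDOPTERIN_1 = re.compile(
    r"[STAN].[CH].{2,3}C[STAG][GSTVMF].C.[LIVMFYW].[LIVMA].{3,4}[DENQKHT]"
)

# Synthetic sequences (hypothetical) that differ only in their gap lengths.
shortest = "SAC" + "AA" + "CSGACALAL" + "AAA" + "D"     # gaps of 2 and 3 residues
longest = "SAC" + "AAA" + "CSGACALAL" + "AAAA" + "D"    # gaps of 3 and 4 residues
too_short = "SAC" + "A" + "CSGACALAL" + "AAA" + "D"     # first gap is only 1 residue

for name, seq in [("shortest", shortest), ("longest", longest), ("too_short", too_short)]:
    print(name, len(seq), MOLYBDOPTERIN_1.fullmatch(seq) is not None)
```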
[ 1] Wootton J.C., Nicolson R.E., Cock J.M., Walters D.E., Burke J.F., Doyle W.A., Bray 
R.C. Biochim. Biophys. Acta 1057:157-185(1991). 

[ 2] Bilous P.T., Cole S.T., Anderson W.F., Weiner J.H. Mol. Microbiol. 2:785-795(1988). 
[ 3] Trieber C.A., Rothery R.A., Weiner J.H. J. Biol. Chem. 269:7103-7109(1994). 
[ 4] Mejean V., Iobbi-Nivol C., Lepelletier M., Giordano G., Chippaux M., Pascal M.-C. 
Mol. Microbiol. 11:1169-1179(1994). 



360. Bacterial mutT domain signature 

The bacterial mutT protein is involved in the GO system [1] responsible for removing an 
oxidatively damaged form of guanine (8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from 
DNA and the nucleotide pool. 8-oxo-dGTP is inserted opposite to dA and dC residues of 
template DNA with almost equal efficiency thus leading to A.T to G.C transversions. MutT 
specifically degrades 8-oxo-dGTP to the monophosphate with the concomitant release of 
pyrophosphate. MutT is a small protein of about 12 to 15 Kd. It has been shown [2,3] that a 
region of about 40 amino acid residues, which is found in the N-terminal part of mutT, can 
also be found in a variety of other prokaryotic, viral, and eukaryotic proteins. These proteins 
are: 

- Streptococcus pneumoniae mutX. 

- A mutT homolog from plasmid pSAM2 of Streptomyces ambofaciens. 

- Bartonella bacilliformis invasion protein A (gene invA). 

- Escherichia coli dATP pyrophosphohydrolase. 

- Protein D250 from African swine fever viruses. 

- Proteins D9 and D10 from a variety of poxviruses. 

- Mammalian 7,8-dihydro-8-oxoguanine triphosphatase (EC 3.1.6.-) [4]. 

- Mammalian diadenosine 5',5'''-P1,P4-tetraphosphate asymmetrical hydrolase 
(Ap4Aase) (EC 3.6.1.17) [5], which cleaves A-5'-PPPP-5'A to yield AMP and ATP. 

- A protein encoded on the antisense RNA of the basic fibroblast growth factor 
gene in higher vertebrates. 

- Yeast protein YSA1. 

- Escherichia coli hypothetical protein yfaO. 

- Escherichia coli hypothetical protein ygdU and HI0901, the corresponding 
Haemophilus influenzae protein. 

- Escherichia coli hypothetical protein yjaD and HI0432, the corresponding 
Haemophilus influenzae protein. 

- Escherichia coli hypothetical protein yrfE. 

- Bacillus subtilis hypothetical protein yqkG. 

- Bacillus subtilis hypothetical protein yzgD. 

- Yeast hypothetical protein YGL067w. 

It is proposed [2] that the conserved domain could be involved in the active center of 
a family of pyrophosphate-releasing NTPases. As a signature pattern the core region of the 
domain was selected; it contains four conserved glutamate residues. 

Consensus pattern: G-x(5)-E-x(4)-[STAGC]-[LIVMAC]-x-R-E-[LIVMFT]-x-E-E- 

[1] Michaels M.L., Miller J.H. J. Bacteriol. 174:6321-6325(1992). 
[2] Koonin E.V. Nucleic Acids Res. 21:4847-4847(1993). 

[3] Mejean V., Salles C., Bullions M.J., Bessman M.J., Claverys J.-P. Mol. Microbiol. 
11:323-330(1994). 

[4] Sakumi K., Furuichi M., Tsuzuki T., Kakuma T., Kawabata S., Maki H., Sekiguchi M. J. 
Biol. Chem. 268:23524-23530(1993). 

[5] Thorne N.M.H., Hankin S., Wilkinson M.C., Nunez C., Barraclough R., McLennan A.G. 
Biochem. J. 311:717-721(1995). 

361. Myb DNA-binding domain repeat signatures 

The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear 
DNA-binding proteins that specifically recognize the sequence YAAC(G/T)G [1]. The myb family 
also includes the following proteins: - Drosophila D-myb [2]. - Vertebrate myb-like proteins 
A-myb and B-myb [3]. - Maize C1 protein, a trans-acting factor which controls the 
expression of genes involved in anthocyanin biosynthesis. - Maize P protein [4], a 
trans-acting factor which regulates the biosynthetic pathway of a flavonoid-derived pigment in 
certain floral tissues. - Arabidopsis thaliana protein GL1 [5], required for the initiation of 
differentiation of leaf hair cells (trichomes). - A number of myb/c1-related proteins in maize 
and barley, whose roles are not yet known [4]. - Yeast BAS1 [7], a transcriptional activator 
for the HIS4 gene. - Yeast REB1 [8], which recognizes sites within both the enhancer and the 
promoter of rRNA transcription, as well as upstream of many genes transcribed by RNA 
polymerase II. - Fission yeast cdc5, a possible transcription factor whose activity is required 
for cell cycle progression and growth during G2. - Fission yeast myb1, which regulates 
telomere length and function. - Yeast hypothetical protein YMR213w. One of the most 
conserved regions in all of these proteins is a domain of 160 amino acids. It consists of three 
tandem repeats of 51 to 53 amino acids. In myb, this repeat region has been shown [9] to be 
involved in DNA-binding. The major part of the first repeat is missing in retroviral v-myb 
sequences and in plant myb-related proteins. Yeast REB1 differs from the other proteins in 
this family in having a single myb-like domain. As shown in the following schematic 
representation, two signature patterns for myb-like domains were developed; the first is 
located in the N-terminal section, the second spans the C-terminal extremity of the domain, 
xxxxxxxxx WxxxEDxxxxxxxxxxxxxxWxxIxxxxxxRxxxxxxxxWxxxx ********* 
. Position of the patterns. 

Consensus pattern: W-[ST]-x(2)-E-[DE]-x(2)-[LIV]- 

Consensus pattern: W-x(2)-[LI]-[SAG]-x(4,5)-R-x(8)-[YW]-x(3)-[LIVM]- 

Note: this pattern detects the three copies of the domain in myb, d-myb, A-myb and B-myb; 
the second of the two complete copies of plant myb-related proteins, and the last two copies 
of yeast BAS1. 
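Since the myb domain occurs as tandem repeats, a single protein may contain several matches to the second pattern, and the number and positions of the matches indicate how many repeat copies are detected. The sketch below counts non-overlapping matches in a sequence. The regular expression is a hand translation of the second pattern; the function name myb_repeat_starts and the two-copy test sequence are hypothetical and were constructed only to illustrate the counting.

```python
import re

# Hand translation of W-x(2)-[LI]-[SAG]-x(4,5)-R-x(8)-[YW]-x(3)-[LIVM]
MYB_REPEAT = re.compile(r"W.{2}[LI][SAG].{4,5}R.{8}[YW].{3}[LIVM]")

def myb_repeat_starts(sequence):
    """Return the 1-based start of every non-overlapping match to the pattern."""
    return [m.start() + 1 for m in MYB_REPEAT.finditer(sequence)]

# Synthetic 23-residue unit that satisfies the pattern (hypothetical).
unit = "W" + "AA" + "L" + "S" + "AAAA" + "R" + "A" * 8 + "Y" + "AAA" + "L"
two_copies = "GG" + unit + "GGGG" + unit + "GG"

starts = myb_repeat_starts(two_copies)
print(len(starts), starts)
```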

[ 1] Biedenkapp H., Borgmeyer U., Sippel A.E., Klempnauer K.-H. Nature 335:835- 
837(1988). 

[ 2] Peters C.W.B., Sippel A.E., Vingron M., Klempnauer K.-H. EMBO J. 6:3085- 
3090(1987). 

[ 3] Nomura N., Takahashi M., Matsui M., Ishii S., Date T., Sasamoto S., Ishizaki R. Nucleic 
Acids Res. 16:11075-11090(1988). 

[ 4] Grotewold E., Athma P., Peterson T. Proc. Natl. Acad. Sci. U.S.A. 88:4587-4591(1991). 
[ 5] Oppenheimer D.G., Herman P.L., Sivakumaran S., Esch J., Marks M.D. Cell 67:483- 
493(1991). 

[ 6] Marocco A., Wissenbach M., Becker D., Paz-Ares J., Saedler H., Salamini F., Rohde W. 
Mol. Gen. Genet. 216:183-187(1989). 

[ 7] Tice-Baldwin K., Fink G.R., Arndt K.T. Science 246:931-935(1989). 
[ 8] Ju Q., Morrow B.E., Warner J.R. Mol. Cell. Biol. 10:5226-5234(1990). 
[ 9] Klempnauer K.-H., Sippel A.E. EMBO J. 6:2719-2725(1987). 

362. NAD-dependent glycerol-3-phosphate dehydrogenase signature 
NAD-dependent glycerol-3-phosphate dehydrogenase (EC 1.1.1.8 ) (GPD) catalyzes the 
reversible reduction of dihydroxyacetone phosphate to glycerol-3-phosphate. It is a 
eukaryotic cytosolic homodimeric protein of about 40 Kd. As a signature pattern a glycine- 
rich region that is probably [1] involved in NAD-binding was selected. 

Consensus pattern: G-[AT]-[LIVM]-K-[DN]-[LIVM](2)-A-x-[GA]-x-G-[LIVMF]-x-[DE]-
G-[LIVM]-x-[LIVMFYW]-G-x-N- 

[ 1] Otto J., Argos P., Rossmann M.G. Eur. J. Biochem. 109:325-330(1980). 

363. Nucleosome assembly protein (NAP) 

It is thought that NAPs may be involved in regulating gene expression as a result of 
histone accessibility [1]. 

[1] Rodriguez P, Munroe D, Prawitt D, Chu LL, Brie E, Kim J, Reid LH, Davies C, 
Nakagama H, Loebbert R, Winterpacht A, Petruzzi MJ, Higgins MJ, Nowak N, Evans G, 
Shows T, Weissman BE, Zabel B, Housman DE, Pelletier J, Genomics 1997;44:253-265. [2] 
Schnieders F, Dork T, Arnemann J, Vogel T, Werner M, Schmidtke J; Hum Mol Genet 
1996;5:1801-1807. 

364. NB-ARC domain 

van der Biezen EA, Jones JD, Curr Biol 1998;8:226-227. 

365. Nucleoside diphosphate kinases active site 

Nucleoside diphosphate kinases (EC 2.7.4.6 ) (NDK) [1] are enzymes required for the 
synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic 
acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for 
protein elongation, signal transduction and microtubule polymerization. In eukaryotes, there 
seems to be a small family of NDK isozymes each of which acts in a different subcellular 
compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are 
hexamers of two highly related chains (A and B) [2]. By random association (A6, A5B...AB5, 
B6), these two kinds of chain form isoenzymes differing in their isoelectric point. NDK are 
proteins of 17 Kd that act via a ping-pong mechanism in which a histidine residue is 
phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of 
magnesium, the phosphoenzyme can transfer its phosphate group to any NDP to produce an 
NTP. NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has 
also been shown [3] that the Drosophila awd (abnormal wing discs) protein is a 
microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The 
sequence of NDK has been highly conserved through evolution. There is a single histidine 
residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism 
[2]. Our signature pattern contains this residue. 
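The random association of A and B chains into hexamers described above gives seven possible subunit compositions, from A6 to B6. As a worked illustration, the sketch below enumerates them and computes the fraction of each composition expected under a simple binomial model. The model assumes that the two chains are equally abundant and associate independently; that assumption is introduced here only for illustration and is not stated in the source.

```python
from fractions import Fraction
from math import comb

SUBUNITS = 6  # NDK isozymes are hexamers

def label(n_a, n_b):
    """Composition label in the style used above, e.g. A5B or AB5."""
    part = lambda name, n: "" if n == 0 else name + (str(n) if n > 1 else "")
    return part("A", n_a) + part("B", n_b)

# Fraction of hexamers with k B chains, assuming equal abundance of A and B
# and independent association (illustrative assumption).
hexamers = {
    label(SUBUNITS - k, k): Fraction(comb(SUBUNITS, k), 2 ** SUBUNITS)
    for k in range(SUBUNITS + 1)
}

for name, fraction in hexamers.items():
    print(name, fraction)
```

Under these assumptions the mixed A3B3 species is the most common and the two homohexamers are the rarest.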

Consensus pattern: N-x(2)-H-[GA]-S-D-[SA]-[LIVMPKNE] [H is the putative active site 
residue] - 

[ 1] Parks R., Agarwal R. (In) The Enzymes (3rd edition) 8:307-334(1973). 

[ 2] Gilles A.-M., Presecan E., Vonica A., Lascu I. J. Biol. Chem. 266:8784-8789(1991). 

[ 3] Biggs J., Hersperger E., Steeg P.S., Liotta L.A., Shearn A. Cell 63:933-940(1990). 

366. Nitrite and sulfite reductases iron-sulfur/siroheme-binding site (NIR_SIR) 
Nitrite reductases (NiR) [1] catalyze the reduction of nitrite into ammonium, the second step 
in the assimilation of nitrate. There are two types of NiR: the higher plant chloroplastic form 
of NiR (EC 1.7.7.1 ) is a monomeric protein that uses reduced ferredoxin as the electron 
donor; while fungal and bacterial NiR (EC 1.6.6.4) are homodimeric proteins that use 
NAD(P)H as the electron donor. Both forms of NiR contain a siroheme-Fe and iron-sulfur 
centers. Sulfite reductase (NADPH) (EC 1.8.1.2) (SIR) [2] is the bacterial enzyme that 
catalyzes the reduction of sulfite to sulfide. SIR is an oligomeric enzyme with a subunit 
composition of alpha(8)-beta(4); the alpha component is a flavoprotein (SIR-FP), while the 




beta component is a siroheme, iron-sulfur protein (SIR-HP). Sulfite reductase (ferredoxin) (EC 
1.8.7.1) [3] is a cyanobacterial and plant monomeric enzyme that also catalyzes the reduction 
of sulfite to sulfide. Anaerobic sulfite reductase (EC 1.8.1.-) (ASR) [4] is a bacterial enzyme 
that catalyzes the NADH-dependent reduction of sulfite to sulfide. ASR is an oligomeric 
enzyme composed of three different subunits. The C component (gene asrC) seems to be a 
siroheme, iron-sulfur protein. These enzymes share a region of sequence similarity in their 
C-terminal half; this region, which spans about 80 amino acids, includes four conserved cysteine 
residues. Two of the Cys are grouped together at the beginning of the domain, and the two 
others are grouped in the middle of the domain. The cysteines are involved in the binding of 
the iron-sulfur center; the last one also binds the siroheme group [2]. A signature pattern from 
the region around the second cluster of cysteines was derived. 

Consensus pattern: [STV]-G-C-x(3)-C-x(6)-[DE]-[LIVMF]-[GAT]-[LIVMF] [The two C's 
are iron-sulfur ligands]- 

[ 1] Campbell W.H., Kinghorn J.R. Trends Biochem. Sci. 15:315-319(1990). 

[ 2] Crane B.R., Siegel L.M., Getzoff E.D. Science 270:59-67(1995). 

[ 3] Gisselmann G., Klausmeier P., Schwenn J.D. Biochim. Biophys. Acta 1144:102-106(1993). 

[ 4] Huang C.J., Barrett E.L. J. Bacteriol. 173:1544-1553(1991). 

367. (NMT) Myristoyl-CoA:protein N-myristoyltransferase signatures. Myristoyl-CoA: 
protein N-myristoyltransferase (EC 2.3.1.97 ) (Nmt) [1] is the enzyme responsible for 
transferring a myristate group on the N-terminal glycine of a number of cellular eukaryotic 
and viral proteins. Nmt is a monomeric protein of about 50 to 60 Kd whose sequence appears 
to be well conserved. Two highly conserved regions have been developed as signature 
patterns. The first one is located in the central section, the second in the C-terminal part. 

Consensus pattern: E-I-N-F-L-C-x-H-K- 
Consensus pattern: K-F-G-x-G-D-G- 
[ 1] Rudnick D.A., McWherter C.A., Gokel G.W., Gordon J.I. Adv. Enzymol. 67:375-430(1993). 

368. ADP-glucose pyrophosphorylase signatures (NTP_transferase) 

ADP-glucose pyrophosphorylase (glucose-1-phosphate adenylyltransferase) [1,2] (EC 
2.7.7.27) catalyzes a very important step in the biosynthesis of alpha 1,4-glucans (glycogen 
or starch) in bacteria and plants: synthesis of the activated glucosyl donor, ADP-glucose, 
from glucose-1-phosphate and ATP. ADP-glucose pyrophosphorylase is a tetrameric 
allosterically regulated enzyme. It is a homotetramer in bacteria while in plant chloroplasts 
and amyloplasts, it is a heterotetramer of two different, yet evolutionarily related, subunits. 
There are a number of conserved regions in the sequence of bacterial and plant ADP-glucose 
pyrophosphorylase subunits. Three of these regions were selected as signature patterns. The 
first two are N-terminal and have been proposed to be part of the allosteric and/or substrate- 
binding sites in the Escherichia coli enzyme (gene glgC). The third pattern corresponds to a 
conserved region in the central part of the enzymes. 

Consensus pattern: [AG]-G-G-x-G-[STK]-x-L-x(2)-L-[TA]-x(3)-A-x-P-A-[LV] - 

Consensus pattern: W-[FY]-x-G-[ST]-A-[DNSH]-[AS]-[LIVMFYW]- 

Consensus pattern: [APV]-[GS]-M-G-[LIVMN]-Y-[IVC]-[LIVMFY]-x(2)-[DENPHK] - 

[ 1] Nakata P.A., Greene T.W., Anderson J.M., Smith-White B.J., Okita T.W., Preiss J. Plant 
Mol. Biol. 17:1089-1093(1991). 

[ 2] Preiss J., Ball K., Hutney J., Smith-White B.J., Li L., Okita T.W. Pure Appl. Chem. 
63:535-544(1991). 

369. Sodium/hydrogen exchanger family 

Na/H antiporters are key transporters in maintaining the 
pH of actively metabolizing cells. The molecular mechanisms 
of antiport are unclear. 

These antiporters contain 10-12 transmembrane regions (M) at the 
amino-terminus and a large cytoplasmic region at the carboxyl 
terminus. The transmembrane regions M3-M12 share identity with 
other members of the family. The M6 and M7 regions are highly 
conserved. Thus, this is thought to be the region that is involved 
in the transport of sodium and hydrogen ions. The cytoplasmic 
region has little similarity throughout the family. 

[1] Dibrov P, Fliegel L; FEBS Lett 1998;424:1-5.
[2] Orlowski J, Grinstein S; J Biol Chem 1997;272:22373-22376.
[3] Numata M, Petrecca K, Lake N, Orlowski J; J Biol Chem 1998;273:6951-6959.

370. Sodium:sulfate symporter family signature (Na_sulph_symp)

Integral membrane proteins that mediate the intake of a wide variety of molecules with the
concomitant uptake of sodium ions (sodium symporters) can be grouped, on the basis of
sequence and functional similarities, into a number of distinct families. One of these families
currently consists of the following proteins:

- Mammalian sodium/sulfate cotransporter [1].

- Mammalian renal sodium/dicarboxylate cotransporter [2], which transports succinate and
citrate.

- Mammalian intestinal sodium/dicarboxylate cotransporter.

- Chlamydomonas reinhardtii putative sulfur deprivation response regulator SAC1 [3].

- Caenorhabditis elegans hypothetical proteins B0285.6, F31F6.6, K08E5.2 and R107.1.

- Escherichia coli hypothetical protein yfbS.

- Haemophilus influenzae hypothetical protein HI0608.

- Synechocystis strain PCC 6803 hypothetical protein sll0640.

- Methanococcus jannaschii hypothetical protein MJ0672.

These transporters are proteins of 430 to 620 amino acids which are highly
hydrophobic and which probably contain about 12 transmembrane regions. As a signature
pattern, a conserved region was selected which is located in or near the penultimate
transmembrane region.

Consensus pattern: [STACP]-S-x(2)-F-x(2)-P-[LIVM]-[GSA]-x(3)-N-x-[LIVM]-V

[ 1] Markovich D., Forgo J., Stange G., Biber J., Murer H. Proc. Natl. Acad. Sci. U.S.A. 
90:8073-8077(1993). 

[ 2] Pajor A.M. Am. J. Physiol. 270:642-648(1996). 

[ 3] Davies J.P., Yildiz F.H., Grossman A. EMBO J. 15:2150-2159(1996). 

371. NifU-like domain

This is an alignment of the carboxy-terminal domain. This is the only common region 
between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The 
biochemical function of NifU is unknown [1]. 

[1] Ouzounis C, Bork P, Sander C; Trends Biochem Sci 1994;19:199-200.

372. Nitrilases / cyanide hydratase signatures 

Nitrilases (EC 3.5.5.1) are enzymes that convert nitriles into their corresponding acids and
ammonia. They are widespread in microbes as well as in plants, where they convert
indole-3-acetonitrile to the hormone indole-3-acetic acid. A conserved cysteine has been shown [1,2]
to be essential for enzyme activity; it seems to be involved in a nucleophilic attack on the
nitrile carbon atom. Cyanide hydratase (EC 4.2.1.66) converts HCN to formamide. In
phytopathogenic fungi, it is used to avoid the toxic effect of cyanide released by wounded
plants [3]. The sequence of cyanide hydratase is evolutionarily related to that of nitrilases.
Yeast hypothetical proteins YIL164c and YIL165c also belong to this family. As signature
patterns for these enzymes, two conserved regions were selected. The first is located in the
N-terminal section while the second, which contains the active site cysteine, is located in the
central section.

Consensus pattern: G-x(2)-[LIVMFY](2)-x-[IF]-x-E-x(2)-[LIVM]-x-G-Y-P

Consensus pattern: G-[GAQ]-x(2)-C-[WA]-E-[NH]-x(2)-[PST]-[LIVMFYS]-x-[KR] [C is
the active site residue]

[ 1] Kobayashi M., Izui H., Nagasawa T., Yamada H. Proc. Natl. Acad. Sci. U.S.A.
90:247-251(1993).

[ 2] Kobayashi M., Komeda H., Yanaka N., Nagasawa T., Yamada H. J. Biol. Chem. 
267:20746-20751(1992). 

[ 3] Wang P., Vanetten H.D. Biochem. Biophys. Res. Commun. 187:1048-1054(1992). 

373. NusB family

The NusB protein is involved in the regulation of rRNA biosynthesis by
transcriptional antitermination.

Huenges M, Rolz C, Gschwind R, Peteranderl R, Berglechner F, Richter G, Bacher A,
Kessler H, Gemmecker G; EMBO J 1998;17:4092-4100.

374. (Neur Chan) Neurotransmitter-gated ion-channels signature 

Neurotransmitter-gated ion-channels [1,2,3,4] provide the molecular basis for rapid signal
transmission at chemical synapses. They are postsynaptic oligomeric transmembrane
complexes that transiently form an ionic channel upon the binding of a specific
neurotransmitter. Presently, the sequences of subunits from five types of
neurotransmitter-gated receptors are known:

- The nicotinic acetylcholine receptor (AchR), an excitatory cation channel. In the motor
endplates of vertebrates, it is composed of four different subunits (alpha, beta, gamma and
delta or epsilon) with a molar stoichiometry of 2:1:1:1. In neurones, the AchR is composed
of two different types of subunits: alpha and non-alpha (also called beta). Nicotinic AchRs
are also found in invertebrates.

- The glycine receptor, an inhibitory chloride ion channel. The glycine receptor is a
pentamer composed of two different subunits (alpha and beta).

- The gamma-aminobutyric-acid (GABA) receptor, which is also an inhibitory chloride ion
channel. The quaternary structure of the GABA receptor is complex; at least four classes of
subunits are known to exist (alpha, beta, gamma, and delta) and there are many variants in
each class (for example: six variants of the alpha class have already been sequenced).

- The serotonin 5HT3 receptor. Serotonin is a biogenic hormone that functions as a
neurotransmitter, a hormone and a mitogen. There are seven major groups of serotonin
receptors; six of these groups (5HT1, 5HT2, and 5HT4 to 5HT7) transduce extracellular
signals by activating G proteins, while 5HT3 is a ligand-gated cation-specific ion channel
which, when activated, causes fast, depolarizing responses in neurons.

- The glutamate receptor, an excitatory cation channel. Glutamate is the main excitatory
neurotransmitter in the brain. At least three different types of glutamate receptors have been
described and are named according to their selective agonists (kainate, N-methyl-D-aspartate
(NMDA) and quisqualate).

All known sequences of subunits from neurotransmitter-gated ion-channels are structurally
related. They are composed of a large extracellular glycosylated N-terminal ligand-binding
domain, followed by three hydrophobic transmembrane regions which form the ionic channel,
followed by an intracellular region of variable length. A fourth hydrophobic region is found
at the C-terminal of the sequence. The sequences of subunits from the AchR, GABA, 5HT3,
and Gly receptors are clearly evolutionarily related and share many regions of sequence
similarity. These sequence similarities are either absent or very weak in the Glu receptors.
In the N-terminal extracellular domain of AchR/GABA/5HT3/Gly receptors, there are two
conserved cysteine residues, which, in AchR, have been shown to form a disulfide bond
essential to the tertiary structure of the receptor. A number of amino acids between the two
disulfide-bonded cysteines are also conserved. Therefore this region was used as a signature
pattern for this subclass of proteins.

Consensus pattern: C-x-[LIVMFQ]-x-[LIVMF]-x(2)-[FY]-P-x-D-x(3)-C [The two Cs are
linked by a disulfide bond]
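
Because the pattern begins and ends with the two disulfide-bonded cysteines, the start and end of each match give the positions of those residues directly. The following is an illustrative sketch only: the regular expression is a hand translation of the consensus pattern above, and the function name cys_loop_cysteines and the example sequence are hypothetical.

```python
import re

# Hand-translated regex form of C-x-[LIVMFQ]-x-[LIVMF]-x(2)-[FY]-P-x-D-x(3)-C
CYS_LOOP = re.compile(r"C.[LIVMFQ].[LIVMF].{2}[FY]P.D.{3}C")

def cys_loop_cysteines(sequence):
    """Return (first_cys, second_cys) 1-based positions for each non-overlapping match."""
    hits = []
    for m in CYS_LOOP.finditer(sequence):
        # m.start() is 0-based, so add 1; m.end() is exclusive and 0-based,
        # which equals the 1-based position of the final cysteine.
        hits.append((m.start() + 1, m.end()))
    return hits

# Made-up sequence containing one copy of the signature.
print(cys_loop_cysteines("MKCALALAAFPADAAACGG"))
```

Every match spans 15 residues, so the two reported positions always differ by 14. A match shows only that the sequence fits the signature; whether the two cysteines actually form a disulfide bond in a given protein is a separate experimental question.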

[ 1] Stroud R.M., McCarthy M.P., Shuster M. Biochemistry 29:11009-11023(1990). 
[ 2] Betz H. Neuron 5:383-392(1990). 

[ 3] Dingledine R., Myers S.J., Nicholas R.A. FASEB J. 4:2632-2645(1990). 
[ 4] Barnard E.A. Trends Biochem. Sci. 17:368-374(1992). 

375. Orotidine 5'-phosphate decarboxylase active site 

Orotidine 5'-phosphate decarboxylase (EC 4.1.1.23) (OMPdecase) [1,2] catalyzes the last step
in the de novo biosynthesis of pyrimidines, the decarboxylation of OMP into UMP. In higher
eukaryotes OMPdecase is part, with orotate phosphoribosyltransferase, of a bifunctional
enzyme, while the prokaryotic and fungal OMPdecases are monofunctional proteins. Some
parts of the sequence of OMPdecase are well conserved across species. The best conserved
region is located in the N-terminal half of OMPdecases and is centered around a lysine
residue which is essential for the catalytic function of the enzyme. This region has been
developed as a signature pattern.

Consensus pattern: [LIVMFTA]-[LIVMF]-x-D-x-K-x(2)-D-I-[GP]-x-T-[LIVMTA] [K is the
active site residue]

[ 1] Jacquet M., Guilbaud R., Garreau H. Mol. Gen. Genet. 211:441-445(1988). 
[ 2] Kimsey H.H., Kaiser D. J. Biol. Chem. 267:819-824(1992). 

376. ATP synthase delta (OSCP) subunit signature

ATP synthase (proton-translocating ATPase) (EC 3.6.1.34) [1,2] is a component
of the cytoplasmic membrane of eubacteria, the inner membrane of mitochondria,
and the thylakoid membrane of chloroplasts. The ATPase complex is composed of
an oligomeric transmembrane sector, called CF(0), which acts as a proton
channel, and a catalytic core, termed coupling factor CF(1).

One of the subunits of the ATPase complex, known as subunit delta in bacteria
and chloroplasts or the Oligomycin Sensitivity Conferral Protein (OSCP) in
mitochondria, seems to be part of the stalk that links CF(0) to CF(1). It
either transmits conformational changes from CF(0) into CF(1) or is involved
in proton conduction [3].

The different delta/OSCP subunits are proteins of approximately 200 amino-acid
residues - once the transit peptide has been removed in the chloroplast and
mitochondrial forms - which show only moderate sequence homology.

The signature pattern used to detect ATPase delta/OSCP subunits is based on a
conserved region in the C-terminal section of these proteins.

Consensus pattern: [LIVM]-x-[LIVMFYT]-x(3)-[LIVMT]-[DENQK]-x(2)-[LIVM]-x-
[GSA]-G-[LIVMFYGA]-x-[LIVM]-[KRHENQ]-x-[GSEN]

[ 1] Futai M., Noumi T., Maeda M. Annu. Rev. Biochem. 58:111-136(1989). 
[ 2] Senior A.E. Physiol. Rev. 68:177-231(1988). 
[ 3] Engelbrecht S., Junge W. Biochim. Biophys. Acta 1015:379-390(1990).

377. Aspartate and ornithine carbamoyltransferases signature 

Aspartate carbamoyltransferase (EC 2.1.3.2) (ATCase) catalyzes the conversion
of aspartate and carbamoyl phosphate to carbamoylaspartate, the second step
in the de novo biosynthesis of pyrimidine nucleotides [1]. In prokaryotes
ATCase consists of two subunits: a catalytic chain (gene pyrB) and a
regulatory chain (gene pyrI), while in eukaryotes it is a domain in a
multifunctional enzyme (called URA2 in yeast, rudimentary in Drosophila, and CAD
in mammals [2]) that also catalyzes other steps of the biosynthesis of
pyrimidines.

Ornithine carbamoyltransferase (EC 2.1.3.3) (OTCase) catalyzes the conversion
of ornithine and carbamoyl phosphate to citrulline. In mammals this enzyme
participates in the urea cycle [3] and is located in the mitochondrial
matrix. In prokaryotes and eukaryotic microorganisms it is involved in the
biosynthesis of arginine. In some bacterial species it is also involved in the
degradation of arginine [4] (the arginine deaminase pathway).

It has been shown [5] that these two enzymes are evolutionarily related. The
predicted secondary structures of both enzymes are similar and there are some
regions of sequence similarity. One of these regions includes three
residues which have been shown, by crystallographic studies [6], to be
implicated in binding the phosphoryl group of carbamoyl phosphate.
This region was selected as a signature for these enzymes.

Consensus pattern: F-x-[EK]-x-S-[GT]-R-T [S, R, and the 2nd T bind carbamoyl phosphate]
Note: the residue in position 3 of the pattern makes it possible to distinguish between
an ATCase (Glu) and an OTCase (Lys).
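
The note on position 3 can be turned into a simple classification rule. The following is a minimal sketch, assuming a protein sequence given in one-letter code: the regular expression is a hand translation of the consensus pattern, position 3 is captured as a group, and the function name classify_carbamoyltransferase and the test sequences are hypothetical.

```python
import re

# Hand-translated regex form of F-x-[EK]-x-S-[GT]-R-T, capturing position 3.
CT_SIGNATURE = re.compile(r"F.([EK]).S[GT]RT")

def classify_carbamoyltransferase(sequence):
    """Return 'ATCase', 'OTCase', or None when the signature is absent."""
    m = CT_SIGNATURE.search(sequence)
    if m is None:
        return None
    # Glu (E) at position 3 indicates an ATCase, Lys (K) an OTCase.
    return "ATCase" if m.group(1) == "E" else "OTCase"

# Made-up sequences, one per case.
print(classify_carbamoyltransferase("AAFAEASGRTAA"))
print(classify_carbamoyltransferase("AAFAKASTRTAA"))
```

Only the first match in the sequence is examined, which is adequate when the signature is expected to occur once per catalytic domain.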

[ 1] Lerner C.G., Switzer R.L. J. Biol. Chem. 261:11156-11165(1986).

[ 2] Davidson J.N., Chen K.C., Jamison R.S., Musmanno L.A., Kern C.B. BioEssays
15:157-164(1993).

[ 3] Takiguchi M., Matsubasa T., Amaya Y., Mori M. BioEssays 10:163-166(1989).

[ 4] Baur H., Stalon V., Falmagne P., Luethi E., Haas D. Eur. J. Biochem.
166:111-117(1987).

[ 5] Houghton J.E., Bencini D.A., O'Donovan G.A., Wild J.R. Proc. Natl. Acad. Sci. U.S.A.
81:4864-4868(1984).

[ 6] Ke H.-M., Honzatko R.B., Lipscomb W.N. Proc. Natl. Acad. Sci. U.S.A.
81:4037-4040(1984).

378. Oleosins signature 

Oleosins [1] are the proteinaceous components of plants' lipid storage bodies
called oil bodies. Oil bodies are small droplets (0.2 to 1.5 mu-m in diameter)
containing mostly triacylglycerol that are surrounded by a phospholipid/
oleosin annulus. Oleosins may have a structural role in stabilizing the lipid
body during desiccation of the seed, by preventing coalescence of the oil.
They may also provide recognition signals for specific lipase anchorage in
lipolysis during seedling growth. Oleosins are found in the monolayer lipid/
water interface of oil bodies and probably interact with both the lipid and
phospholipid moieties.

Oleosins are proteins of 16 Kd to 24 Kd and are composed of three domains: an 
N-terminal hydrophilic region of variable length (from 30 to 60 residues); a 
central hydrophobic domain of about 70 residues and a C-terminal amphipathic 
region of variable length (from 60 to 100 residues). The central hydrophobic 
domain is proposed to be made up of beta-strand structure and to interact with 
the lipids [2]. It is the only domain whose sequence is conserved and therefore 
a section from that domain was selected as a signature pattern. 

Consensus pattern: [AG]-[ST]-x(2)-[AG]-x(2)-[LIVM]-[SAD]-T-P-[LIVMF](4)-F-S-P- 
[LIVM](3)-P-A 

[ 1] Murphy D.J., Keen J.N., O'Sullivan J.N., Au D.M.Y., Edwards E.-W., Jackson P.J., 
Cummins I., Gibbons T., Shaw C.H., Ryan A.J. Biochim. Biophys. Acta 1088:86-94(1991). 
[ 2] Tzen J.T.C., Lie G.C., Huang A.H.C. J. Biol. Chem. 267:15626-15634(1992). 

379. (Orbi VP5) Orbivirus outer capsid protein VP5 

This paper shows the location of the different capsid proteins 
and their relation to each other. 

[1] Schoehn G, Moss SR, Nuttall PA, Hewat EA; Virology 1997;235:191-200. 

380. Orn/DAP/Arg decarboxylases family 2 signatures

Pyridoxal-dependent decarboxylases acting on ornithine, lysine, arginine and 
related substrates can be classified into two different families on the basis 
of sequence similarities [1,2,3]. The second family consists of: 

- Eukaryotic ornithine decarboxylase (EC 4.1.1.17) (ODC). ODC catalyzes the 
transformation of ornithine into putrescine. 

- Prokaryotic diaminopimelic acid decarboxylase (EC 4.1.1.20) (DAPDC). DAPDC
catalyzes the conversion of diaminopimelic acid into lysine, the last step
in the biosynthesis of lysine.

- Pseudomonas syringae pv. tabaci protein tabA. tabA is probably involved in
the biosynthesis of tabtoxin and is highly similar to DAPDC.

- Bacterial and plant biosynthetic arginine decarboxylase (EC 4.1.1.19)
(ADC). ADC catalyzes the transformation of arginine into agmatine, the
first step in the biosynthesis of putrescine from arginine.

The above proteins, while most probably evolutionarily related, do not share
extensive regions of sequence similarities. Two of the conserved regions were 
selected as signature patterns. The first pattern contains a conserved lysine 
residue which is known, in mouse ODC [4], to be the site of attachment of the 
pyridoxal-phosphate group. The second pattern contains a stretch of three 
consecutive glycine residues and has been proposed to be part of a substrate- 
binding region [5]. 

These enzymes are collectively known as group IV decarboxylases [3]. 

Consensus pattern: [FY]-[PA]-x-K-[SACV]-[NHCLFW]-x(4)-[LIVMF]-[LIVMTA]-x(2)- 
[LIVMA]-x(3)-[GTE] [K is the pyridoxal-P attachment site] 

Consensus pattern: [GS]-x(2,6)-[LIVMSCP]-x(2)-[LIVMF]-[DNS]-[LIVMCA]-G-G-G-
[LIVMFY]-[GSTPCEQ]

[ 1] Bairoch A. Unpublished observations (1993). 

[ 2] Martin C., Cami B., Yeh P., Stragier P., Parsot C., Patte J.-C. Mol. Biol. Evol.
5:549-559(1988).

[ 3] Sandmeier E., Hale T.I., Christen P. Eur. J. Biochem. 221:997-1002(1994). 

[ 4] Poulin R., Lu L., Ackermann B., Bey P., Pegg A.E. J. Biol. Chem. 267:150-158(1992). 
[ 5] Moore R.C., Boyle S.M. J. Bacteriol. 172:4631-4640(1990).

381. Osteopontin signature

Osteopontin is an acidic phosphorylated glycoprotein of about 40 Kd which is
abundant in the mineral matrix of bones and which binds tightly to 
hydroxyapatite [1,2,3]. It is suggested that osteopontin might function as a 
cell attachment factor and could play a key role in the adhesion of 
osteoclasts to the mineral matrix of bone. 

Osteopontin-K is a kidney protein which is highly similar to osteopontin and 
probably also involved in cell-adhesion. 

As a signature pattern a highly conserved region located at the 
N-terminal extremity of the mature protein was selected. 

Consensus pattern: [KQ]-x-[TA]-x(2)-[GA]-S-S-E-E-K

[ 1] Butler W.T. Connect. Tissue Res. 23:123-36(1989). 

[ 2] Gorski J.P. Calcif. Tissue Int. 50:391-396(1992). 

[ 3] Denhardt D.T., Guo X. FASEB J. 7:1475-1482(1993). 

382. Oxysterol-binding protein family signature 

A number of eukaryotic proteins that seem to be involved with sterol synthesis 
and/or its regulation have been found [1] to be evolutionarily related:

- Mammalian oxysterol-binding protein (OSBP). A protein of about 800 amino- 
acid residues that binds a variety of oxysterols: oxygenated derivatives of 
cholesterol. OSBP seems to play a complex role in the regulation of sterol 
metabolism. 

- Yeast proteins HES1 and KES1; highly related proteins of 434 residues that
seem to play a role in ergosterol synthesis.

- Yeast OSH1, a protein of 859 residues that also plays a role in ergosterol
synthesis.

- Yeast hypothetical protein YHR001w (437 residues).

- Yeast hypothetical protein YHR073w (996 residues).

- Yeast hypothetical protein YKR003w (448 residues).

All these proteins contain a moderately conserved domain of about 250 residues
located in the C-terminal half of OSBP, OSH1 and YHR073w and in the central
section of the other proteins. As a signature pattern, the best conserved part
of this domain was selected, a region that contains a conserved
pentapeptide.

Consensus pattern: E-[KQ]-x-S-H-[HR]-P-P-x-[STACF]-A 

[ 1] Jiang B., Brown J.L., Sheraton J., Fortin N., Bussey H. Yeast 10:341-353(1994). 

383. FMN oxidoreductase 

384. Oxidoreductase FAD/NAD-binding domain 
Number of members: 250 

[1] Medline: 92084635
The sequence of squash NADH:nitrate reductase and its
relationship to the sequences of other flavoprotein
oxidoreductases. A family of flavoprotein pyridine nucleotide
cytochrome reductases.
Hyde GE, Crawford NM, Campbell W;
J Biol Chem 1991;266:23542-23547.

[2] Medline: 95111952
Crystal structure of the FAD-containing fragment of corn
nitrate reductase at 2.5 A resolution: relationship to other
flavoprotein reductases.
Lu G, Campbell WH, Schneider G, Lindqvist Y;
Structure 1994;2:809-821.

385. (oxidored molyb) Eukaryotic molybdopterin oxidoreductases signature

A number of different eukaryotic oxidoreductases that require and bind a
molybdopterin cofactor have been shown [1] to share a few regions of sequence
similarity. These enzymes are:

- Xanthine dehydrogenase (EC 1.1.1.204), which catalyzes the oxidation of
xanthine to uric acid with the concomitant reduction of NAD. Structurally,
this enzyme of about 1300 amino acids consists of at least three distinct
domains: an N-terminal 2Fe-2S ferredoxin-like iron-sulfur binding domain
(see <PDOC00175>), a central FAD/NAD-binding domain and a C-terminal
Mo-pterin domain.

- Aldehyde oxidase (EC 1.2.3.1), which catalyzes the oxidation of aldehydes into
acids. Aldehyde oxidase is highly similar to xanthine dehydrogenase in its
sequence and domain structure.

- Nitrate reductase (EC 1.6.6.1), which catalyzes the reduction of nitrate
to nitrite. Structurally, this enzyme of about 900 amino acids consists of
an N-terminal Mo-pterin domain, a central cytochrome b5-type heme-binding
domain (see <PDOC00170>) and a C-terminal FAD/NAD-binding cytochrome
reductase domain.

- Sulfite oxidase (EC 1.8.3.1), which catalyzes the oxidation of sulfite to 
sulfate. Structurally, this enzyme of about 460 amino acids consists of an 
N-terminal cytochrome b5-binding domain followed by a Mo-pterin domain. 

There are a few conserved regions in the sequence of the molybdopterin-binding 
domain of these enzymes. The pattern used to detect these proteins is based 
on one of them. It contains a cysteine residue which could be involved in 
binding the molybdopterin cofactor. 

Consensus pattern: [GA]-x(3)-[KRNQHT]-x(11,14)-[LIVMFYWS]-x(8)-[LIVMF]-x-C-x(
[DEN]-R-x(2)-[DE]



[ 1] Wootton J.C., Nicolson R.E., Cock J.M., Walters D.E., Burke J.F., Doyle W.A., Bray 
R.C. Biochim. Biophys. Acta 1057:157-185(1991). 

386. (oxidored q1) NADH-Ubiquinone/plastoquinone (complex I), various chains

This family is part of complex I which catalyses the
transfer of two electrons from NADH to ubiquinone in a
reaction that is associated with proton translocation
across the membrane.
Number of members: 1824

[1] Medline: 93110040
The NADH:ubiquinone oxidoreductase (complex I) of respiratory chains.
Walker JE;
Q Rev Biophys 1992;25:253-324.

387. (oxidored q3) NADH-ubiquinone/plastoquinone oxidoreductase chain 6. 179 members. 

388. (oxidored q5) NADH-ubiquinone oxidoreductase chain 4, amino terminus 
[1] Walker JE ; Q Rev Biophys 1992;25:253-324. 

389. (oxidored q6) Respiratory-chain NADH dehydrogenase 20 Kd subunit signature

Respiratory-chain NADH dehydrogenase (EC 1.6.5.3) [1,2] (also known as complex
I or NADH-ubiquinone oxidoreductase) is an oligomeric enzymatic complex
located in the inner mitochondrial membrane which also seems to exist in
the chloroplast and in cyanobacteria (as a NADH-plastoquinone oxidoreductase).
Among the 25 to 30 polypeptide subunits of this bioenergetic enzyme complex
there is one with a molecular weight of 20 Kd (in mammals) [3], which is a
component of the iron-sulfur (IP) fragment of the enzyme. It seems to bind a
4Fe-4S iron-sulfur cluster. The 20 Kd subunit has been found to be:

- Nuclear encoded, as a precursor form with a transit peptide in mammals, and
in Neurospora crassa.

- Mitochondrial encoded in Paramecium (gene psbG).

- Chloroplast encoded in various higher plants (gene ndhK or psbG).

The 20 Kd subunit is highly similar to [4]:

- Synechocystis strain PCC 6803 proteins psbG1 and psbG2.

- Subunit B of Escherichia coli NADH-ubiquinone oxidoreductase (gene nuoB).

- Subunit NQO6 of Paracoccus denitrificans NADH-ubiquinone oxidoreductase.

- Subunit 7 of Escherichia coli formate hydrogenlyase (gene hycG).

- Subunit I of Escherichia coli hydrogenase-4 (gene hyfI).

As a signature pattern, a highly conserved region was selected that is located in the
central section of this subunit and contains a conserved cysteine that
is probably involved in the binding of the 4Fe-4S center.

Consensus pattern: [GN]-x-D-[KRST]-[LIVMF](2)-P-[IV]-D-[LIVMFYW](2)-x-P-x-C-P-
[PT] [The C is a putative 4Fe-4S ligand]

[ 1] Ragan C.I. Curr. Top. Bioenerg. 15:1-36(1987). 

[ 2] Weiss H., Friedrich T., Hofhaus G., Preis D. Eur. J. Biochem. 197:563-576(1991). 
[ 3] Arizmendi J.M., Runswick M.J., Skehel J.M., Walker J.E. FEBS Lett.
301:237-242(1992).

[ 4] Weidner U., Geier S., Ptock A., Friedrich T., Leif H., Weiss H. J. Mol. Biol.
233:109-122(1993).

390. p53 tumor antigen signature 

The p53 tumor antigen [1 to 5, E1,E2] is a protein found in increased amounts 
in a wide variety of transformed cells. It is also detectable in many 
proliferating nontransformed cells, but it is undetectable or present at low 
levels in resting cells. It is frequently mutated or inactivated in many types 
of cancer. p53 seems to act as a tumor suppressor in some, but probably not 
all, tumor types. p53 is probably involved in cell cycle regulation, and may 
be a trans-activator that acts to negatively regulate cellular division by 
controlling a set of genes required for this process. 

p53 is a phosphoprotein of about 390 amino acids which can be subdivided into 
four domains: a highly charged acidic region of about 75 to 80 residues, a 
hydrophobic proline-rich domain (position 80 to 150), a central region (from 
150 to about 300), and a highly basic C-terminal region. The sequence of p53 
is well conserved in vertebrate species; attempts to identify p53 in other 
eukaryotic phyla have so far been unsuccessful.

As a signature pattern for p53 a perfectly conserved stretch of 13
residues located in the central region of the protein was selected. This region, known as
domain IV in [3], is involved (along with an adjacent region) in the binding 
of the large T antigen of SV40. In man this region is the focus of a variety 
of point mutations in cancerous tumors. 

Consensus pattern: M-C-N-S-S-C-M-G-G-M-N-R-R 

[ 1] Levine A.J., Momand J., Finlay C.A. Nature 351:453-456(1991). 

[ 2] Levine A.J., Momand J. Biochim. Biophys. Acta 1032:119-136(1990). 

[ 3] Soussi T., Caron De Fromentel C., May P. Oncogene 5:945-952(1990).

[ 4] Lane D.P., Benchimol S. Genes Dev. 4:1-8(1990). 

[ 5] Ulrich S.J., Anderson C.W., Mercer W.E., Appella E. J. Biol. Chem. 267:15259- 
15262(1992). 

391. (P5CR) Delta 1-pyrroline-5-carboxylate reductase signature

Delta 1-pyrroline-5-carboxylate reductase (P5CR) (EC 1.5.1.2) [1,2] is the
enzyme that catalyzes the terminal step in the biosynthesis of proline from
glutamate, the NAD(P)-dependent reduction of 1-pyrroline-5-carboxylate into
proline.

The sequences of P5CR from eubacteria (gene proC), archaebacteria and 
eukaryotes show only a moderate level of overall similarity. As a signature 
pattern, the best conserved region located in the C-terminal 
section of P5CR was selected. 

Consensus pattern: [PALF]-x(2,3)-[LIV]-x(3)-[LIVM]-[STAC]-[STV]-x-[GAN]-G-x-T- 
[AG]-[LIV]-x(2)-[LMF]-[DENQK] 

[ 1] Delauney A.J., Verma D.P. Mol. Gen. Genet. 221:299-305(1990). 

[ 2] Savioz A., Jeenes D.J., Kocher H.P., Haas D. Gene 86:107-111(1990). 

392. Poly-adenylate binding protein, unique domain. 

393. (PAL) Phenylalanine and histidine ammonia-lyases active site

Phenylalanine ammonia-lyase (EC 4.3.1.5) (PAL) is a key enzyme of plant and
fungal phenylpropanoid metabolism which is involved in the biosynthesis of a
wide variety of secondary metabolites such as flavonoids, furanocoumarin
phytoalexins and cell wall components. These compounds have many important
roles in plants during normal growth and in responses to environmental stress.
PAL catalyzes the removal of an ammonia group from phenylalanine to form
trans-cinnamate.

Histidine ammonia-lyase (EC 4.3.1.3) (histidase) catalyzes the first step in 
histidine degradation, the removal of an ammonia group from histidine to 
produce urocanic acid. 

The two types of enzymes are functionally and structurally related [1]. They 
are the only enzymes which are known to have the modified amino acid dehydroalanine
(DHA) in their active site. A serine residue has been shown [2,3,4] to
be the precursor of this essential electrophilic moiety. The region around 
this active site residue is well conserved and can be used as a signature 
pattern. 

Consensus pattern: G-[STG]-[LIVM]-[STG]-[AC]-S-G-[DH]-L-x-P-L-[SA]-x(2)-[SA] [S is 
the active site residue] 

[ 1] Taylor R.G., Lambert M.A., Sexsmith E., Sadler S.J., Ray P.N., Mahuran D.J., McInnes
R.R. J. Biol. Chem. 265:18192-18199(1990).

[ 2] Langer M., Reck G., Reed J., Retey J. Biochemistry 33:6462-6467(1994).

[ 3] Schuster B., Retey J. FEBS Lett. 349:252-254(1994).

[ 4] Taylor R.G., McInnes R.R. J. Biol. Chem. 269:27473-27477(1994).



394. PAS domain 
-!- CAUTION. This family does not currently match all known
examples of PAS domains.

PAS motifs appear in archaea, eubacteria and eukarya. Probably
the most surprising identification of a PAS domain was that in
EAG-like K+-channels [1,3].
Number of members: 308

[1] Medline: 97446881
PAS domain S-boxes in archaea, bacteria and sensors for
oxygen and redox.
Zhulin IB, Taylor BL, Dixon R;
Trends Biochem Sci 1997;22:331-333.

[2] Medline: 95275818
1.4 A structure of photoactive yellow protein, a cytosolic
photoreceptor: unusual fold, active site, and chromophore.
Borgstahl GE, Williams DR, Getzoff ED;
Biochemistry 1995;34:6278-6287.

[3] Medline: 98044337
PAS: a multifunctional domain family comes to light.
Ponting CP, Aravind L;
Curr Biol 1997;7:674-677.

395. (PBP) Phosphatidylethanolamine-binding protein family signature 

Mammalian phosphatidylethanolamine-binding protein (also known as basic
cytosolic 21 Kd protein) is a 186 residue protein found in a variety of
tissues [1]. It binds hydrophobic ligands, such as phosphatidylethanolamine,
but also seems [2] to bind nucleotides such as GTP and FMN. It is suggested
that it could act in membrane remodeling during growth and maturation. This
protein belongs to a family that also includes:

- Drosophila antennal protein A5, a putative odorant-binding protein.

- Onchocerca volvulus antigen Ov-16 and the related proteins D1, D2 and D3.

- Plasmodium falciparum putative phosphatidylethanolamine-binding protein.

- Toxocara canis secreted antigen TES-26. This larval protein has been shown
to bind phosphatidylethanolamine.

- Yeast protein DKA1 (also known as NSP1 or TFS1). The function of this
protein is not very clear.

- Yeast hypothetical protein YLR179C.

- Caenorhabditis elegans hypothetical protein F40A3.3.

As a signature pattern, the best conserved region was selected, which is located
at the end of the first third of the sequence of these proteins.

Consensus pattern: [FYL]-x-[LV]-[LIVF]-x-[TIV]-[DC]-P-D-x-P-[SN]-x(10)-H 

[ 1] Seddiqi N., Bollengier F., Alliel P.M., Perin J.P., Bonnet F., Bucquoy S., Jolles P.,
Schoentgen F. J. Mol. Evol. 39:655-660(1994).

[ 2] Schoentgen F., Jolles P. FEBS Lett. 369:22-6(1995).

396. PCI domain 

This domain has also been called the PINT motif (Proteasome,
Int-6, Nip-1 and TRIP-15) [1].
Number of members: 49

[1] Medline: 98308842
The PCI domain: a common theme in three multiprotein
complexes.
Hofmann K, Bucher P;
Trends Biochem Sci 1998;23:204-205.

[2] Medline: 98266368
Homologues of 26S proteasome subunits are regulators of
transcription and translation.
Aravind L, Ponting CP;
Protein Sci 1998;7:1250-1254.

397. (PCMT) Protein-L-isoaspartate (D-aspartate) O-methyltransferase signature

Protein-L-isoaspartate (D-aspartate) O-methyltransferase (EC 2.1.1.77) (PCMT) [1] (which is
also known as L-isoaspartyl protein carboxyl methyltransferase) is an enzyme that catalyzes the
transfer of a methyl group from S-adenosylmethionine to the free carboxyl groups of
D-aspartyl or L-isoaspartyl residues in a variety of peptides and proteins. The enzyme does not
act on normal L-aspartyl residues. L-isoaspartyl and D-aspartyl are the products of the
spontaneous deamidation and/or isomerization of normal L-aspartyl and L-asparaginyl
residues in proteins. PCMT plays a role in the repair and/or degradation of these damaged
proteins; the enzymatic methyl esterification of the abnormal residues can lead to their
conversion to normal L-aspartyl residues. PCMT is a well-conserved and widely distributed
cytosolic protein of about 24 Kd. As a signature pattern, a conserved region in the central part
of this enzyme has been developed.

Consensus pattern: [GSA]-D-G-x(2)-G-[FYWV]-x(3)-[AS]-P-[FY]-[DN]-x-I

[ 1] Kagan R.M., McFadden H.J., McFadden P.N., O'Connor C., Clarke S. Comp. Biochem.
Physiol. 117B:379-385(1997).

398. (PCNA) Proliferating cell nuclear antigen signatures 

Proliferating cell nuclear antigen (PCNA) [1,2] is a protein involved in DNA
replication by acting as a cofactor for DNA polymerase delta, the
polymerase responsible for leading strand DNA replication.
A similar protein exists in yeast (gene POL30) [3] and is associated with
polymerase III, the yeast analog of polymerase delta. In baculoviruses the
ETL protein has been shown [4] to be highly related to PCNA and is probably
associated with the viral encoded DNA polymerase. A homolog of PCNA is also
found in archaebacteria.

As signatures for this family of proteins, two conserved regions were selected 
located in the N-terminal section. The second one has been proposed to bind 
DNA. 

Consensus pattern: [GA]-[LIVMF]-x-[LIVMA]-x-[SAV]-[LIVM]-D-x-[NSAE]-[HKR]-[VI]- 
x-[LY]-[VGA]-x-[LIVM]-x-[LIVM]-x(4)-F 

Consensus pattern: [RKA]-C-[DE]-[RH]-x(3)-[LIVMF]-x(3)-[LIVM]-x-[SGAN]-[LIVMF]-
x-K-[LIVMF](2)

[ 1] Bravo R., Frank R., Blundell P.A., McDonald-Bravo H. Nature 326:515-517(1987).
[ 2] Suzuka I., Hata S., Matsuoka M., Kosugi S., Hashimoto J. Eur. J. Biochem. 195:571-
575(1991).
[ 3] Bauer G.A., Burgers P.M.J. Nucleic Acids Res. 18:261-265(1990).
[ 4] O'Reilly D.R., Crawford A.M., Miller L.K. Nature 337:606-606(1989).



399. (PDT) Prephenate dehydratase signatures 

Prephenate dehydratase (EC 4.2.1.51) (PDT) catalyzes the decarboxylation of 
prephenate into phenylpyruvate. In microorganisms PDT is involved in the 
terminal pathway of the biosynthesis of phenylalanine. In some bacteria such 
as Escherichia coli PDT is part of a bifunctional enzyme (P-protein) that also
catalyzes the transformation of chorismate into prephenate (chorismate
mutase) while in other bacteria it is a monofunctional enzyme. The sequence of
monofunctional PDT aligns well with the C-terminal part of that of P-proteins
[1].

As signature patterns for PDT two conserved regions were selected. The first 
region contains a conserved threonine which has been said to be essential for 
the activity of the enzyme in E. coli. The second region includes a conserved 
glutamate. Both regions are in the C-terminal part of PDT. 

Consensus pattern: [FY]-x-[LIVM]-x(2)-[LIVM]-x(5)-[DN]-x(5)-T-R-F-[LIVMW]-x-
[LIVM]

[ 1] Fischer R.S., Zhao G., Jensen R.A. J. Gen. Microbiol. 137:1293-1301(1991). 



400. PDZ domain (Also known as DHR or GLGF). 

PDZ domains are found in diverse signaling proteins. 

[1] Ponting CP, Phillips C, Davies KE, Blake DJ;
Bioessays 1997;19:469-479.
[2] Doyle DA, Lee A, Lewis J, Kim E, Sheng M, MacKinnon R;
Cell 1996;85:1067-1076.
[3] Ponting CP;
Protein Sci 1997;6:464-468.

401. (PPDK_N_term) PEP-utilizing enzymes signatures
A number of enzymes that catalyze the transfer of a phosphoryl group from 
phosphoenolpyruvate (PEP) via a phospho-histidine intermediate have been shown 
to be structurally related [1,2,3,4]. These enzymes are: 

- Pyruvate,orthophosphate dikinase (EC 2.7.9.1) (PPDK). PPDK catalyzes the 
reversible phosphorylation of pyruvate and phosphate by ATP to PEP and 
diphosphate. In plants PPDK functions in the direction of the formation of
PEP, which is the primary acceptor of carbon dioxide in C4 and crassulacean 
acid metabolism plants. In some bacteria, such as Bacteroides symbiosus, 
PPDK functions in the direction of ATP synthesis. 

- Phosphoenolpyruvate synthase (EC 2.7.9.2) (pyruvate,water dikinase). This 
enzyme catalyzes the reversible phosphorylation of pyruvate by ATP to form 
PEP, AMP and phosphate, an essential step in gluconeogenesis when pyruvate 
and lactate are used as a carbon source. 

- Phosphoenolpyruvate-protein phosphotransferase (EC 2.7.3.9). This is the 
first enzyme of the phosphoenolpyruvate-dependent sugar phosphotransferase 
system (PTS), a major carbohydrate transport system in bacteria. The PTS 
catalyzes the phosphorylation of incoming sugar substrates concomitant 
with their translocation across the cell membrane. The general mechanism 
of the PTS is the following: a phosphoryl group from PEP is transferred
to enzyme-I (EI) of PTS which in turn transfers it to a phosphoryl carrier
protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar- 
specific permease. 

All these enzymes share the same catalytic mechanism: they bind PEP and 
transfer the phosphoryl group from it to a histidine residue. The sequence 
around that residue is highly conserved and can be used as a signature pattern 
for these enzymes. As a second signature pattern a conserved
region was selected in the C-terminal part of the PEP-utilizing enzymes. The biological
significance of this region is not yet known.



Consensus pattern: G-[GA]-x-[TN]-x-H-[STA]-[STAV]-[LIVM](2)-[STAV]-[RG] [H is 
phosphorylated] 
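
Several entries annotate one residue inside the pattern, here the histidine that is phosphorylated. A short sketch of how a match could report the position of that residue rather than the start of the whole signature is shown below. The regular expression is a hand translation of the pattern above, and the peptide is a made-up string built only to satisfy it; neither is taken from a real enzyme sequence.

```python
import re

# Hand-translated from the consensus pattern above. The capture group marks
# the histidine that the entry annotates as phosphorylated.
PEP_HIS_SIGNATURE = re.compile(r"G[GA].[TN].(H)[STA][STAV][LIVM]{2}[STAV][RG]")


def annotated_histidines(sequence):
    """Return the 1-based position of the signature histidine in each match."""
    return [m.start(1) + 1 for m in PEP_HIS_SIGNATURE.finditer(sequence)]


# Hypothetical peptide used only for illustration.
toy_peptide = "MK" + "GGATAHSSLLSR" + "AA"
his_positions = annotated_histidines(toy_peptide)
```

Reporting the annotated residue instead of the match start makes it easier to compare a hit against experimental data on the catalytic site.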




Consensus pattern: [DEQSK]-x-[LIVMF]-S-[LIVMF]-G-[ST]-N-D-[LIVM]-x-Q-
[LIVMFYGT]-[STALIV]-[LIVMF]-[GAS]-x(2)-R 

[ 1] Reizer J., Hoischen C., Reizer A., Pham T.N., Saier M.H. Jr. Protein Sci. 2:506-
521(1993).

[ 2] Reizer J., Reizer A., Merrick M.J., Plunkett G. III, Rose D.J., Saier M.H. Jr. Gene
181:103-108(1996).

[ 3] Pocalyko D.J., Carroll L.J., Martin B.M., Babbitt P.C., Dunaway-Mariano D. 
Biochemistry 29:10757-10765(1990). 

[ 4] Niersbach M., Kreuzaler F., Geerse R.H., Postma P., Hirsch H.J. Mol. Gen. Genet. 
232:332-336(1992). 

402. (PEPCK_ATP) Phosphoenolpyruvate carboxykinase (ATP) signature
Phosphoenolpyruvate carboxykinase (ATP) (EC 4.1.1.49) (PEPCK) [1] catalyzes 
the formation of phosphoenolpyruvate by decarboxylation of oxaloacetate while 
hydrolyzing ATP, a rate limiting step in gluconeogenesis (the biosynthesis of 
glucose). 

The sequence of this enzyme has been obtained from Escherichia coli, yeast,
and Trypanosoma brucei; these three sequences are evolutionarily related and
share many regions of similarity. As a signature pattern a highly
conserved region was selected that contains four acidic residues and which is located in
the central part of the enzyme. The beginning of the pattern is located about
10 residues C-terminal to an ATP-binding motif 'A' (P-loop) (see
<PDOC00017>) and is also part of the ATP-binding domain [2].

Consensus pattern: L-I-G-D-D-E-H-x-W-x-[DE]-x-G-[IV]-x-N 

Note: phosphoenolpyruvate carboxykinase (GTP) (EC 4.1.1.32), an enzyme that catalyzes
the same reaction but uses GTP instead of ATP, is not related to the above enzyme (see
<PDOC00421>).

[ 1] Medina V., Pontarollo R., Glaeske D., Tabel H., Goldie H. J. Bacteriol. 172:7151-
7156(1990).
[ 2] Matte A., Goldie H., Sweet R.M., Delbaere L.T.J. J. Mol. Biol. 256:126-143(1996).

403. (Pepcase) Phosphoenolpyruvate carboxylase active sites. Phosphoenolpyruvate 
carboxylase (EC 4.1.1.31) (PEPcase) catalyzes the irreversible beta-carboxylation of 
phosphoenolpyruvate by bicarbonate to yield oxaloacetate and phosphate. The enzyme is 
found in all plants and in a variety of microorganisms. A histidine [1] and a lysine [2] have 
been implicated in the catalytic mechanism of this enzyme; the regions around these active 
site residues are highly conserved in PEPcase from various plants, bacteria and cyanobacteria 
and can be used as signature patterns for this type of enzyme.

Consensus pattern: [VT]-x-T-A-H-P-T-[EQ]-x(2)-R-[KRH] [H is an active site residue]

Consensus pattern: [IV]-M-[LIVM]-G-Y-S-D-S-x-K-D-[STAG]-G [K is an active site
residue]

[ 1] Terada K., Izui K. Eur. J. Biochem. 202:797-803(1991).
[ 2] Jiao J.-A., Podesta F.E., Chollet R., O'Leary M.H., Andreo C.S. Biochim. Biophys. Acta
1041:291-295(1990).

404. PET112 family signature 

The following proteins from eukaryotes, prokaryotes and archaebacteria belong 
to the same family: 

- Yeast mitochondrial protein PET112 [1], which plays an unknown role in the 
expression of mitochondrial genes, probably at the level of translation. 

- Aspergillus nidulans mitochondrial protein nempA. 

- Bacillus subtilis hypothetical protein yzdD. 

- Moraxella catarrhalis hypothetical protein in bloR-1 3'region. 

- Mycoplasma genitalium hypothetical protein MG100.

- Methanococcus jannaschii hypothetical proteins MJ0019 and MJ0160.

The sizes of these proteins range from 419 to 630 amino acids. As a signature
pattern, a conserved region located in the N-terminal section was selected.



Consensus pattern: [DN]-x-[DN]-R-x(3)-P-L-[LIV]-E-[LIV]-x-[ST]-x-P

[ 1] Mulero J.J., Rosenthal J.K., Fox T.D. Curr. Genet. 25:299-304(1994).

405. (PFK) Phosphofructokinase signature

Phosphofructokinase (EC 2.7.1.11) (PFK) [1,2] is a key regulatory enzyme in 
the glycolytic pathway. It catalyzes the phosphorylation by ATP of fructose 
6-phosphate to fructose 1,6-bisphosphate. In bacteria PFK is a tetramer of 
identical 36 Kd subunits. In mammals it is a tetramer of 80 Kd subunits. Each
80 Kd subunit consists of two homologous domains which are highly related to
the bacterial 36 Kd subunits. In humans there are three tissue-specific types
of PFK isozymes: PFKM (muscle), PFKL (liver), and PFKP (platelet). In yeast
PFK is an octamer composed of four 100 Kd alpha chains (gene PFK1) and four
100 Kd beta chains (gene PFK2); like the mammalian 80 Kd subunits, the yeast
100 Kd subunits are composed of two homologous domains.

As a signature pattern for PFK a region that contains three basic
residues involved in fructose-6-phosphate binding was selected.

Consensus pattern: [RK]-x(4)-G-H-x-Q-[QR]-G-G-x(5)-D-R [The R/K, the H and the Q/R 
are involved in fructose-6-P binding] 

Note: Escherichia coli has two phosphofructokinase isozymes which are encoded by genes
pfkA (major) and pfkB (minor). The pfkB isozyme is not evolutionarily related to other
prokaryotic or eukaryotic PFKs (see <PDOC00504>).

[ 1] Poorman R.A., Randolph A., Kemp R.G., Heinrikson R.L. Nature 309:467-469(1984). 
[ 2] Heinisch J., Ritzel R.G., von Borstel R.C., Aguilera A., Rodicio R., Zimmermann F.K. 
Gene 78:309-321(1989). 

406. (PGAM) Phosphoglycerate mutase family phosphohistidine signature 
Phosphoglycerate mutase (EC 5.4.2.1) (PGAM) and bisphosphoglycerate mutase 
(EC 5.4.2.4) (BPGM) are structurally related enzymes which catalyze reactions 
involving the transfer of phospho groups between the three carbon atoms of
phosphoglycerate [1,2]. Both enzymes can catalyze three different reactions,
although in different proportions:

- The isomerization of 2-phosphoglycerate (2-PGA) to 3-phosphoglycerate (3- 
PGA) with 2,3-diphosphoglycerate (2,3-DPG) as the primer of the reaction. 

- The synthesis of 2,3-DPG from 1,3-DPG with 3-PGA as a primer.

- The degradation of 2,3-DPG to 3-PGA (phosphatase EC 3.1.3.13 activity). 

In mammals, PGAM is a dimeric protein. There are two isoforms of PGAM: the M 
(muscle) and B (brain) forms. In yeast, PGAM is a tetrameric protein. BPGM is 
a dimeric protein and is found mainly in erythrocytes where it plays a major 
role in regulating hemoglobin oxygen affinity as a consequence of controlling
2,3-DPG concentration. 

The catalytic mechanism of both PGAM and BPGM involves the formation of a 
phosphohistidine intermediate [3]. 

The bifunctional enzyme 6-phosphofructo-2-kinase / fructose-2,6-bisphosphatase 
(EC 2.7.1.105 and EC 3.1.3.46) (PF2K) [4] catalyzes both the synthesis and the
degradation of fructose-2,6-bisphosphate. PF2K is an important enzyme in the 
regulation of hepatic carbohydrate metabolism. Like PGAM/BPGM, the fructose- 
2,6-bisphosphatase reaction involves a phosphohistidine intermediate and the 
phosphatase domain of PF2K is structurally related to PGAM/BPGM. 

The bacterial enzyme alpha-ribazole-5'-phosphate phosphatase (gene cobC) which
is involved in cobalamin biosynthesis also belongs to this family [5].

A signature pattern was built around the phosphohistidine residue.

Consensus pattern: [LIVM]-x-R-H-G-[EQ]-x(3)-N [H is the phosphohistidine residue] 
Note: some organisms harbor a form of PGAM independent of 2,3-DPG; this enzyme is
not related to the family described above [6].

[ 1] Le Boulch P., Joulin V., Garel M.-C., Rosa J., Cohen-Solal M. Biochem. Biophys. Res.
Commun. 156:874-881(1988).

[ 2] White M.F., Fothergill-Gilmore L.A. FEBS Lett. 229:383-387(1988).

[ 3] Rose Z.B. Meth. Enzymol. 87:43-51(1982).

[ 4] Bazan J.F., Fletterick R.J., Pilkis S.J. Proc. Natl. Acad. Sci. U.S.A. 86:9642-
9646(1989).

[ 5] O'Toole G.A., Trzebiatowski J.R., Escalante-Semerena J.C. J. Biol. Chem. 269:26503-
26511(1994).

[ 6] Grana X., De Lecea L., El-Maghrabi M.R., Urena J.M., Caellas C., Carreras J.,
Puigdomenech P., Pilkis S.J., Climent F. J. Biol. Chem. 267:12797-12803(1992).

407. (PGI) Phosphoglucose isomerase signatures 

Phosphoglucose isomerase (EC 5.3.1.9) (PGI) [1,2] is a dimeric enzyme that 
catalyzes the reversible isomerization of glucose-6-phosphate and fructose-6- 
phosphate. PGI is involved in different pathways: in most higher organisms it 
is involved in glycolysis; in mammals it is involved in gluconeogenesis; in 
plants in carbohydrate biosynthesis; in some bacteria it provides a gateway 
for fructose into the Entner-Doudoroff pathway. PGI has been shown [3] to be
identical to neuroleukin, a neurotrophic factor which supports the survival of 
various types of neurons. 

The sequence of PGI from many species ranging from bacteria to mammals is 
available and has been shown to be highly conserved. As signature patterns for
this enzyme two conserved regions were selected; the first region is located in
the central section of PGI, while the second one is located in its C-terminal
section.

Consensus pattern: [DENS]-x-[LIVM]-G-G-R-[FY]-S-[LIVMT]-x-[STA]-[PSAC]- 
[LIVMA]-G 

Consensus pattern: [GS]-x-[LIVM]-[LIVMFYW]-x(4)-[FY]-[DN]-Q-x-G-V-E-x(2)-K

[ 1] Achari A., Marshall S.E., Muirhead H., Palmieri R.H., Noltmann E.A. Philos. Trans.
R. Soc. Lond., B, Biol. Sci. 293:145-157(1981).

[ 2] Smith M.W., Doolittle R.F. J. Mol. Evol. 34:544-545(1992). 

[ 3] Faik P., Walker J.I.H., Redmill A.A.M., Morgan M.J. Nature 332:455-456(1988). 

408. (PGK) Phosphoglycerate kinase signature 

Phosphoglycerate kinase (EC 2.7.2.3) (PGK) [1] catalyzes the second step in
the second phase of glycolysis, the reversible conversion of 1,3-diphospho-
glycerate to 3-phosphoglycerate with generation of one molecule of ATP. PGK
is found in all living organisms and its sequence has been highly conserved
throughout evolution. It is a two-domain protein; each domain is composed of
six repeats of an alpha/beta structural motif. As a signature pattern for
PGKs, a conserved region in the N-terminal region was selected.

Consensus pattern: [KRHGTCVN]-[VT]-[LIVMF]-[LIVMC]-R-x-D-x-N-[SACV]-P 

[ 1] Watson H.C., Littlechild J.A. Biochem. Soc. Trans. 18:187-190(1990). 

409. (PGM_PMM) Phosphoglucomutase and phosphomannomutase phosphoserine signature

- Phosphoglucomutase (EC 5.4.2.2) (PGM). PGM is an enzyme responsible for 
the conversion of D-glucose 1-phosphate into D-glucose 6-phosphate. PGM
participates in both the breakdown and synthesis of glucose [1].

- Phosphomannomutase (EC 5.4.2.8) (PMM). PMM is an enzyme responsible for 
the conversion of D-mannose 1-phosphate into D-mannose 6-phosphate. PMM is 
required for different biosynthetic pathways in bacteria. For example, in 
enterobacteria such as Escherichia coli there are two different genes
coding for this enzyme: rfbK which is involved in the synthesis of the O
antigen of lipopolysaccharide and cpsG which is required for the synthesis 
of the M antigen capsular polysaccharide [2]. In Pseudomonas aeruginosa PMM 
(gene algC) is involved in the biosynthesis of the alginate layer [3] and 
in Xanthomonas campestris (gene xanA) it is involved in the biosynthesis of 
xanthan [4]. In Rhizobium strain ngr234 (gene noeK) it is involved in the 
biosynthesis of the nod factor. 

- Phosphoacetylglucosamine mutase (EC 5.4.2.3) which converts N-acetyl-D- 
glucosamine 1-phosphate into the 6-phosphate isomer. 

The catalytic mechanism of both PGM and PMM involves the formation of a 
phosphoserine intermediate [1]. The sequence around the serine residue is well 
conserved and can be used as a signature pattern. 

In addition to PGM and PMM there are at least three uncharacterized proteins 
that belong to this family [5,6]:

- Urease operon protein ureC from Helicobacter pylori.

- Escherichia coli protein mrsA.

- Paramecium tetraurelia parafusin, a phosphoglycoprotein involved in
exocytosis.

- A Methanococcus vannielii hypothetical protein in the 3'region of the gene
for ribosomal protein S10.

Consensus pattern: [GSA]-[LIVM]-x-[LIVM]-[ST]-[PGA]-S-H-x-P-x(4)-[GNHE] [S is the
phosphoserine residue]

Note: PMM from fungi does not belong to this family.

[ 1] Dai J.B., Liu Y., Ray W.J. Jr., Konno M. J. Biol. Chem. 267:6322-6337(1992).
[ 2] Stevenson G., Lee S.J., Romana L.K., Reeves P.R. Mol. Gen. Genet. 227:173-
180(1991).

[ 3] Zielinski N.A., Chakrabarty A.M., Berry A. J. Biol. Chem. 266:9754-9763(1991).
[ 4] Koeplin R., Arnold W., Hoette B., Simon R., Wang G., Puehler A. J. Bacteriol.
174:191-199(1992).

[ 5] Bairoch A. Unpublished observations (1993).

[ 6] Subramanian S.V., Wyroba E., Andersen A.P., Satir B.H. Proc. Natl. Acad. Sci. U.S.A.
91:9832-9836(1994).

410. PH domain profile 

The 'pleckstrin homology' (PH) domain is a domain of about 100 residues that 
occurs in a wide range of proteins involved in intracellular signaling or as
constituents of the cytoskeleton [1 to 7]. 

The function of this domain is not clear; several putative functions have been
suggested:

- binding to the beta/gamma subunit of heterotrimeric G proteins,
- binding to lipids, e.g. phosphatidylinositol-4,5-bisphosphate,
- binding to phosphorylated Ser/Thr residues,
- attachment to membranes by an unknown mechanism.

It is possible that different PH domains have totally different ligand 
requirements.

The 3D structure of several PH domains has been determined [8]. All known
cases have a common structure consisting of two perpendicular anti-parallel 
beta sheets, followed by a C-terminal amphipathic helix. The loops connecting 
the beta-strands differ greatly in length, making the PH domain relatively 
difficult to detect. There are no totally invariant residues within the PH 
domain. 

Proteins reported to contain one or more PH domains belong to the following
families:

- Pleckstrin, the protein where this domain was first detected, is the major
substrate of protein kinase C in platelets. Pleckstrin is one of the rare
proteins to contain two PH domains.

- Ser/Thr protein kinases such as the Akt/Rac family, the beta-adrenergic
receptor kinases, the mu isoform of PKC and the trypanosomal NrkA family. 

- Tyrosine protein kinases belonging to the Btk/Itk/Tec subfamily. 

- Insulin Receptor Substrate 1 (IRS-1). 

- Regulators of small G-proteins like guanine nucleotide releasing factor
GNRP (Ras-GRF) (which contains 2 PH domains), guanine nucleotide exchange
proteins like vav, dbl, SoS and yeast CDC24, GTPase activating proteins
like rasGAP and BEM2/IPL2, and the human break point cluster protein bcr.

- Cytoskeletal proteins such as dynamin (see <PDOC00362>), Caenorhabditis
elegans kinesin-like protein unc-104 (see <PDOC00343>), spectrin beta-
chain, syntrophin (2 PH domains) and yeast nuclear migration protein NUM1.

- Mammalian phosphatidylinositol-specific phospholipase C (PI-PLC) (see
<PDOC50007>) isoforms gamma and delta. Isoform gamma contains two PH
domains, the second one is split into two parts separated by about 400
residues.

- Oxysterol binding proteins OSBP, yeast OSH1 and YHR073w.

- Mouse protein citron, a putative rho/rac effector that binds to the GTP-
bound forms of rho and rac.

- Several yeast proteins involved in cell cycle regulation and bud formation
like BEM2, BEM3, BUD4 and the BEM1-binding proteins BOI2 (BEB1) and BOI1
(BOB1).

- Caenorhabditis elegans protein MIG-10.

- Caenorhabditis elegans hypothetical proteins C04D8.1, K06H7.4 and ZK632.12. 

- Yeast hypothetical proteins YBR129c and YHR155w.

The profile for the PH domain, which has been developed by Toby Gibson at the
EMBL, covers the total length of the domain. Several proteins contain large
insertions in the PH domain and are thus difficult to detect with this 
profile. In some of these cases, the profile will align only to one half of 
the PH domain. 

Sequences known to belong to this class detected by the pattern: ALL. But it
should be noted that while all sequences containing PH domains are detected, 
not all PH domains are. Some of the split domains lie below the cutoff 
threshold. 
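
Because the PH domain has no invariant residues, it is described by a profile with a cutoff rather than by a fixed pattern. The idea can be sketched with a toy position-specific scoring table, as below. Everything in this sketch is an assumption made for illustration: the four-column table, the default score, and the cutoff value are invented, and real profiles also score insertions and deletions, which this ungapped version omits.

```python
# Hypothetical 4-column profile: each column maps residues to a score;
# any residue not listed in a column receives DEFAULT_SCORE.
DEFAULT_SCORE = -1
TOY_PROFILE = [
    {"L": 2, "I": 2, "V": 1},
    {"K": 2, "R": 2},
    {"G": 3},
    {"W": 4, "F": 2},
]
CUTOFF = 7  # assumed detection threshold


def best_window(sequence, profile=TOY_PROFILE):
    """Return (score, start) of the best ungapped window, or None if too short."""
    width = len(profile)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(res, DEFAULT_SCORE) for col, res in zip(profile, window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


def is_detected(sequence, cutoff=CUTOFF):
    """True when the best window reaches the cutoff."""
    hit = best_window(sequence)
    return hit is not None and hit[0] >= cutoff


strong = best_window("AALKGWAA")  # window LKGW scores 2 + 2 + 3 + 4
weak = best_window("AAVKAFAA")    # window VKAF scores 1 + 2 - 1 + 2
```

The second toy sequence resembles the profile but falls below the cutoff, which mirrors the remark above that some divergent or split domains are missed.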

[ 1] Mayer B.J., Ren R., Clark K.L., Baltimore D. Cell 73:629-630(1993). 
[ 2] Haslam R.J., Koide H.B., Hemmings B.A. Nature 363:309-310(1993). 
[ 3] Musacchio A., Gibson T.J., Rice P., Thompson J., Saraste M. Trends Biochem. Sci.
18:343-348(1993).
[ 4] Gibson T.J., Hyvonen M., Musacchio A., Saraste M., Birney E. Trends Biochem. Sci.
19:349-353(1994).
[ 5] Pawson T. Nature 373:573-580(1995).
[ 6] Ingley E., Hemmings B.A. J. Cell. Biochem. 56:436-443(1994).
[ 7] Saraste M., Hyvonen M. Curr. Opin. Struct. Biol. 5:403-408(1995).
[ 8] Riddihough G. Nat. Struct. Biol. 1:755-757(1994).

411. PHD-finger 
[1] Medline: 95216093

The PHD finger: implications for chromatin-mediated 
transcriptional regulation. 
Aasland R, Gibson TJ, Stewart AF; 
Trends Biochem Sci 1995;20:56-59. 
Number of members: 181 

412. (PI-PLC-X) Phosphatidylinositol-specific phospholipase C profiles 
Phosphatidylinositol-specific phospholipase C (EC 3.1.4.11), a eukaryotic
intracellular enzyme, plays an important role in signal transduction processes
[1]. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-3,4,5-
triphosphate into the second messenger molecules diacylglycerol and inositol-
1,4,5-triphosphate. This catalytic process is tightly regulated by reversible
phosphorylation and binding of regulatory proteins [2 to 4].

In mammals, there are at least 6 different isoforms of PI-PLC; they differ in
their domain structure, their regulation, and their tissue distribution. Lower
eukaryotes also possess multiple isoforms of PI-PLC. 

All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to 
as 'X-box' and 'Y-box'. The order of these two regions is always the same 
(NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance
between these two regions is only 50-100 residues but in the gamma isoforms 
one PH domain, two SH2 domains, and one SH3 domain are inserted between the 
two PLC-specific domains. The two conserved regions have been shown to be 
important for the catalytic activity. At the C-terminal of the Y-box, there is 
a C2 domain (see <PDOC00380>) possibly involved in Ca-dependent membrane 
attachment. 
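
The fixed X-then-Y order and the variable spacing described above can be checked once the two regions have been located in a sequence. The sketch below assumes that hit coordinates are already available from some profile search; the Hit type, the function names, and the coordinates in the example are hypothetical and chosen only to illustrate the 50-100 residue spacing versus a much longer insert.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def xy_spacer(x_box: Hit, y_box: Hit) -> Optional[int]:
    """Residues between the X-box and the Y-box, or None if X does not precede Y."""
    if x_box.end >= y_box.start:
        return None
    return y_box.start - x_box.end - 1


def has_long_insert(spacer: int, typical_max: int = 100) -> bool:
    """True when the spacer exceeds the typical upper bound quoted in the text."""
    return spacer > typical_max


# Hypothetical coordinates, not taken from any real protein.
short_spacer = xy_spacer(Hit(300, 450), Hit(520, 640))
long_spacer = xy_spacer(Hit(320, 465), Hit(950, 1070))
```

A long spacer only indicates that extra material lies between the two boxes; identifying the inserted domains would need separate searches.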

Profile analysis shows that sequences with significant similarity 
to the X-box domain occur also in prokaryotic and trypanosome PI-specific
phospholipases C. Apart from this region, the prokaryotic enzymes show no 
similarity to their eukaryotic counterparts. 

Two profiles were developed, one covering the X-box, the other the Y-box.

[ 1] Meldrum E., Parker P.J., Carozzi A. Biochim. Biophys. Acta 1092:49-71(1991).
[ 2] Rhee S.G., Choi K.D. Adv. Second Messenger Phosphoprotein Res. 26:35-61(1992).
[ 3] Rhee S.G., Choi K.D. J. Biol. Chem. 267:12393-12396(1992).
[ 4] Sternweis P.C., Smrcka A.V. Trends Biochem. Sci. 17:502-506(1992).

413. (PI-PLC-Y) Phosphatidylinositol-specific phospholipase C profiles 
Phosphatidylinositol-specific phospholipase C (EC 3.1.4.11), a eukaryotic
intracellular enzyme, plays an important role in signal transduction processes
[1]. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-3,4,5-
triphosphate into the second messenger molecules diacylglycerol and inositol-
1,4,5-triphosphate. This catalytic process is tightly regulated by reversible
phosphorylation and binding of regulatory proteins [2 to 4].

In mammals, there are at least 6 different isoforms of PI-PLC; they differ in
their domain structure, their regulation, and their tissue distribution. Lower
eukaryotes also possess multiple isoforms of PI-PLC.

All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to 
as 'X-box' and 'Y-box'. The order of these two regions is always the same 
(NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance
between these two regions is only 50-100 residues but in the gamma isoforms
one PH domain, two SH2 domains, and one SH3 domain are inserted between the
two PLC-specific domains. The two conserved regions have been shown to be
important for the catalytic activity. At the C-terminal of the Y-box, there is
a C2 domain (see <PDOC00380>) possibly involved in Ca-dependent membrane
attachment.

Profile analysis shows that sequences with significant similarity 
to the X-box domain occur also in prokaryotic and trypanosome PI-specific
phospholipases C. Apart from this region, the prokaryotic enzymes show no 
similarity to their eukaryotic counterparts. 

Two profiles were developed, one covering the X-box, the other the Y-box.

[ 1] Meldrum E., Parker P.J., Carozzi A. Biochim. Biophys. Acta 1092:49-71(1991).
[ 2] Rhee S.G., Choi K.D. Adv. Second Messenger Phosphoprotein Res. 26:35-61(1992).
[ 3] Rhee S.G., Choi K.D. J. Biol. Chem. 267:12393-12396(1992).
[ 4] Sternweis P.C., Smrcka A.V. Trends Biochem. Sci. 17:502-506(1992).

414. (PK) Pyruvate kinase active site signature 

Pyruvate kinase (EC 2.7.1.40) (PK) [1] catalyzes the final step in glycolysis,
the conversion of phosphoenolpyruvate to pyruvate with the concomitant
phosphorylation of ADP to ATP. PK requires both magnesium and potassium ions
for its activity. PK is found in all living organisms. In vertebrates there
are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle,
heart, and brain), and M2 (early fetal tissues). In Escherichia coli there are
two isozymes: PK-I (gene pykF) and PK-II (gene pykA). All PK isozymes seem to
be tetramers of identical subunits of about 500 amino acid residues.

As a signature pattern for PK a conserved region was selected that includes a
lysine residue which seems to be the acid/base catalyst responsible for the
interconversion of pyruvate and enolpyruvate, and a glutamic acid residue
implicated in the binding of the magnesium ion.

Consensus pattern: [LIVAC]-x-[LIVM](2)-[SAPCV]-K-[LIV]-E-[NKRST]-x-[DEQHS]-
[GSTA]-[LIVM] [K is the active site residue] [E is a magnesium ligand]

[ 1] Muirhead H. Biochem. Soc. Trans. 18:193-196(1990). 

415. (PLDc) Phospholipase D. Active site motif

Phosphatidylcholine-hydrolyzing phospholipase D (PLD) isoforms are
activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic
acid from phosphatidylcholine, which may be essential for the formation
of certain types of transport vesicles or may be constitutive vesicular
transport to signal transduction pathways.
PC-hydrolyzing PLD is a homologue of cardiolipin synthase,
phosphatidylserine synthase, bacterial PLDs, and viral proteins.
Each of these appears to possess a domain duplication which is apparent
by the presence of two motifs containing well-conserved histidine, lysine,
aspartic acid, and/or asparagine residues which may contribute to the active site.
An E. coli endonuclease (nuc) and similar proteins appear
to be PLD homologues but possess only one of these motifs.
The profile contained here represents only the putative active site
regions, since an accurate multiple alignment of the repeat units
has not been achieved.

Number of members: 139 
[1] Medline: 96303814
A novel family of phospholipase D homologues that includes
phospholipid synthases and putative endonucleases: 
identification of duplicated repeats and potential active 
site residues. 
Ponting CP, Kerr ID; 
Protein Sci 1996;5:914-922. 
[2] Medline: 96334293
A duplicated catalytic motif in a new superfamily of 
phosphohydrolases and phospholipid synthases that includes 
poxvirus envelope proteins. 
Koonin EV; 
Trends Biochem Sci 1996;21:242-243. 
[3] Medline: 94327597
Cloning and expression of phosphatidylcholine-hydrolyzing 
phospholipase D from Ricinus communis L. 
Wang X, Xu L, Zheng L; 
J Biol Chem 1994;269:20312-20317. 
[4] Medline: 97386825
Regulation of eukaryotic phosphatidylinositol-specific 
phospholipase C and phospholipase D. 
Singer WD, Brown HA, Sternweis PC; 
Annu Rev Biochem 1997;66:475-509. 

416. (PMI_typeI) Phosphomannose isomerase type I signatures
Phosphomannose isomerase (EC 5.3.1.8) (PMI) [1,2] is the enzyme that catalyzes
the interconversion of mannose-6-phosphate and fructose-6-phosphate. In 
eukaryotes, it is involved in the synthesis of GDP-mannose which is a 
constituent of N- and O-linked glycans as well as GPI anchors. In prokaryotes, 
it is involved in a variety of pathways including capsular polysaccharide 
biosynthesis and D-mannose metabolism. 

Three classes of PMI have been defined on the basis of sequence similarities 
[1]. The first class comprises all known eukaryotic PMI as well as the enzyme
encoded by the manA gene in enterobacteria such as Escherichia coli. Class I
PMIs are proteins of about 42 to 50 Kd which bind a zinc ion essential for
their activity. 

As signature patterns for class I PMI, two conserved regions were selected. The 
first one is located in the N-terminal section of these proteins, the second 
in the C-terminal half. Both patterns contain a residue involved [3] in the 
binding of the zinc ion. 

Consensus pattern: Y-x-D-x-N-H-K-P-E [E is a zinc ligand] 

Consensus pattern: H-A-Y-[LIVM]-x-G-x(2)-[LIVM]-E-x-M-A-x-S-D-N-x-[LIVM]-R-A-
G-x-T-P-K [H is a zinc ligand] 
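
When an entry gives two signatures with stated locations, both hits and their relative positions can be checked together. The sketch below hand-translates the two class I PMI patterns above into regular expressions and tests that the first hit precedes the second and that the second lies in the C-terminal half. The function names and the test sequence are assumptions made for illustration; the sequence is an artificial string, not a real PMI.

```python
import re

# Hand-translated from the two consensus patterns above.
N_TERM_SIGNATURE = re.compile(r"Y.D.NHKPE")
C_TERM_SIGNATURE = re.compile(r"HAY[LIVM].G.{2}[LIVM]E.MA.SDN.[LIVM]RAG.TPK")


def class_one_pmi_hits(sequence):
    """Return 1-based start positions (n_hit, c_hit); None where a signature is absent."""
    n_hit = N_TERM_SIGNATURE.search(sequence)
    c_hit = C_TERM_SIGNATURE.search(sequence)
    return (
        n_hit.start() + 1 if n_hit else None,
        c_hit.start() + 1 if c_hit else None,
    )


def consistent_with_entry(sequence):
    """True if both signatures occur, in order, with the second in the C-terminal half."""
    n_pos, c_pos = class_one_pmi_hits(sequence)
    if n_pos is None or c_pos is None:
        return False
    return n_pos < c_pos and c_pos > len(sequence) / 2


# Artificial sequence built only to contain both signatures.
toy_sequence = (
    "M" + "A" * 10 + "YADANHKPE" + "A" * 60
    + "HAYLAGAALEAMAASDNALRAGATPK" + "A" * 10
)
toy_hits = class_one_pmi_hits(toy_sequence)
```

Requiring both signatures is stricter than requiring either one, so it trades some sensitivity for fewer chance matches.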

[ 1] Proudfoot A.E.I., Turcatti G., Wells T.N.C., Payton M.A., Smith D.J. Eur. J. Biochem. 
219:415-423(1994). 

[ 2] Coulin F., Magnenat E., Proudfoot A.E.I., Payton M.A., Scully P., Wells T.N.C. 
Biochemistry 32:14139-14144(1993). 

[ 3] Cleasby A., Wonacott A., Skarzynski T., Hubbard R.E., Davies G.J., Proudfoot A.E.I., 
Bernard A.R., Payton M.A., Wells T.N.C. Nat. Struct. Biol. 3:470-479(1996). 

417. (PNP_UDP_1) Purine and other phosphorylases family 1 signature
The following phosphorylases belong to the same family:

- Purine nucleoside phosphorylase (EC 2.4.2.1) (PNP) from most bacteria 
(gene deoD). This enzyme catalyzes the cleavage of guanosine or inosine to 
respective bases and sugar-1-phosphate molecules [1].

- Uridine phosphorylase (EC 2.4.2.3) (UdRPase) from bacteria (gene udp) and 
mammals. Catalyzes the cleavage of uridine into uracil and ribose-1- 
phosphate. The products of the reaction are used either as carbon and 
energy sources or in the rescue of pyrimidine bases for nucleotide 
synthesis [2]. 

- 5'-methylthioadenosine phosphorylase (EC 2.4.2.28) (MTA phosphorylase) from 
Sulfolobus solfataricus [3]. 

As a signature pattern, a conserved region was selected in the central part of 
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these enzymes. 

Consensus pattern: [GST]-x-G-[LIVM]-G-x-[PA]-S-x-[GSTA]-I-x(3)-E-L

-Note: mammalian and some bacterial PNP as well as eukaryotic MTA phosphorylase
belong to a different family of phosphorylases (see <PDOC00954>).

[ 1] Takehara M., Ling F., Izawa S., Inoue Y., Kimura A. Biosci. Biotechnol. Biochem. 
59:1987-1990(1995). 

[ 2] Watanabe S.-I., Hino A., Wada K., Eliason J.F., Uchida T. J. Biol. Chem.
270:12191-12196(1995).

[ 3] Cacciapuoti G., Porcelli M., Bertoldo C., De Rosa M., Zappia V. J. Biol. Chem.
269:24762-24769(1994).

418. (PP2C) Protein phosphatase 2C signature 

Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian 
serine/threonine specific protein phosphatases (EC 3.1.3.16). PP2C [1] is a 
monomeric enzyme of about 42 Kd which shows broad substrate specificity and 
is dependent on divalent cations (mainly manganese and magnesium) for its 
activity. Its exact physiological role is still unclear. Three isozymes are 
currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are 
at least four PP2C homologs: phosphatase PTC1 [2], which has weak tyrosine
phosphatase activity in addition to its activity on serines, phosphatases PTC2
and PTC3, and hypothetical protein YBR125c. Isozymes of PP2C are also known
from Arabidopsis thaliana (ABI1, PPH1), Caenorhabditis elegans (FEM-2,
F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia.
In Arabidopsis thaliana, the kinase-associated protein phosphatase (KAPP) [3]
is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and
contains a C-terminal PP2C domain.

PP2C does not seem to be evolutionarily related to the main family of
serine/threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly
similar to the catalytic subunit of pyruvate dehydrogenase phosphatase
(EC 3.1.3.43) (PDPC) [4], which catalyzes dephosphorylation and concomitant
reactivation of the alpha subunit of the E1 component of the pyruvate
dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is
magnesium-dependent.

As a signature pattern, the best conserved region was selected which is located 
in the N-terminal part and contains a perfectly conserved tripeptide. This 
region includes a conserved aspartate residue involved in divalent cation 
binding [5]. 

Consensus pattern: [LIVMFY]-[LIVMFYA]-[GSAC]-[LIVM]-[FYC]-D-G-H-[GAV] 
-Note: PP2C belongs [6] to a superfamily which also includes bacterial proteins such as 
Bacillus spoIIE, rsbU and rsbW, Synechocystis PCC 6803 icfO as well as a domain in fungal 
adenylate cyclases. 

[ 1] Wenk J., Trompeter H.-I., Pettrich K.-G., Cohen P.T.W., Campbell D.G., Mieskes G. 
FEBS Lett. 297:135-138(1992).

[ 2] Maeda T., Tsai A.Y.M., Saito H. Mol. Cell. Biol. 13:5408-5417(1993). 

[ 3] Stone J.M., Collinge M.A., Smith R.D., Horn M.A., Walker J.C. Science 266:793-795(1994).

[ 4] Lawson J.E., Niu X.-D., Browning K.S., Trong H.L., Yan J., Reed L.J. Biochemistry 
32:8987-8993(1993). 

[ 5] Das A.K., Helps N.R., Cohen P.T.W., Barford D. EMBO J. 15:6798-6809(1996).
[ 6] Bork P., Brown N.P., Hegyi H., Schultz J. Protein Sci. 5:1421-1425(1996). 

419. (PPTA) Protein prenyltransferases alpha subunit repeat signature 
Protein prenyltransferases catalyze the transfer of an isoprenyl moiety to a 
cysteine four residues from the C-terminus of several proteins. They are 
heterodimeric enzymes consisting of alpha and beta subunits. The alpha subunit 
is thought to participate in a stable complex with the isoprenyl substrate; 
the beta subunit binds the peptide substrate. Distinct protein 
prenyltransferases might share a common alpha subunit. Both the alpha and
beta subunits show repetitive sequence motifs [1]. These repeats have distinct
structural and functional implications and are unrelated to each other. Known
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protein prenyltransferase alpha subunits are: 

- Mammalian protein farnesyltransferase alpha subunit. 

- Yeast protein RAM2, a protein farnesyltransferase alpha subunit. 

- Yeast protein BET4, a protein geranylgeranyltransferase alpha subunit. 

The conserved domain of the alpha subunit consists of about 34 amino acids and
is repeated five times. It contains an invariant tryptophan possibly involved 
in heterodimerization with the conserved phenylalanines in the repeated 
domains of the beta subunits, via hydrophobic bonds. The signature pattern for 
this domain is centered on the invariant tryptophan. 

Consensus pattern: [PSIAV]-x-[NDFV]-[NEQIY]-x-[LIVMAGP]-W-[NQSTHF]-[FYHQ]- 
[LIVMR] 

[ 1] Boguski M.S., Murray A.W., Powers S. New Biol. 4:408-411(1992). 

420. (PR55) Protein phosphatase 2A regulatory subunit PR55 signatures 
Protein phosphatase 2A (PP2A) is a serine/threonine phosphatase involved in
many aspects of cellular function including the regulation of metabolic
enzymes and proteins involved in signal transduction. PP2A is a trimeric
enzyme that consists of a core composed of a catalytic subunit associated with
a 65 Kd regulatory subunit (PR65), also called subunit A; this complex then
associates with a third variable subunit (subunit B), which confers distinct
properties to the holoenzyme [1]. One of the forms of the variable subunit is
a 55 Kd protein (PR55) which is highly conserved in mammals (where three
isoforms are known to exist), Drosophila and yeast (gene CDC55). This subunit
could perform a substrate recognition function or be responsible for targeting
the enzyme complex to the appropriate subcellular compartment.

As signature patterns, two perfectly conserved sequences of 15 residues were
selected; one located in the N-terminal region, the other in the center of
the protein.

Consensus pattern: E-F-D-Y-L-K-S-L-E-I-E-E-K-I-N 
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Consensus pattern: N-[AG]-H-[TA]-Y-H-I-N-S-I-S-[LIVM]-N-S-D 

[ 1] Mayer-Jaekel R., Hemmings B.A. Trends Cell Biol. 4:287-291(1994).

421. N-(5'phosphoribosyl)anthranilate (PRA) isomerase 
[1] Wilmanns M, Priestle JP, Niermann T, Jansonius JN; 

J Mol Biol 1992;223:477-507. 

422. (PRK) Phosphoribulokinase signature 

Phosphoribulokinase (EC 2.7.1.19) (PRK) [1,2] is one of the enzymes specific
to the Calvin reductive pentose phosphate cycle, which is the major route by
which carbon dioxide is assimilated and reduced by autotrophic organisms. PRK
catalyzes the ATP-dependent phosphorylation of ribulose 5-phosphate into
ribulose 1,5-bisphosphate, which is the substrate for RubisCO.

PRKs of diverse origins show different properties with respect to the size of
the protein, the subunit structure, or the enzymatic regulation. However, an
alignment of the sequences of PRK from plants, algae, photosynthetic and
chemoautotrophic bacteria shows that there are a few regions of sequence
similarity. As a signature pattern, one of these regions was selected.

Consensus pattern: K-[LIVM]-x-R-D-x(3)-R-G-x-[ST]-x-E 

[ 1] Kossmann J., Klintworth R., Bowien B. Gene 85:247-252(1989).

[ 2] Gibson J.L., Chen J.-H., Tower P.A., Tabita F.R. Biochemistry 29:8085-8093(1990). 

423. (PRPP synt) Phosphoribosyl pyrophosphate synthetase signature 

Phosphoribosyl pyrophosphate synthetase (EC 2.7.6.1) (PRPP synthetase)
catalyzes the formation of PRPP from ATP and ribose 5-phosphate. PRPP is then
used in various biosynthetic pathways, for example in the formation of
purines, pyrimidines, histidine and tryptophan. PRPP synthetase requires
inorganic phosphate and magnesium ions for its stability and activity.

In mammals, three isozymes of PRPP synthetase are found; in yeast there are at
least four isozymes.

As a signature pattern for this enzyme, a very conserved region was selected
that has been suggested to be involved in binding divalent cations [1]. This
region contains two conserved aspartic acid residues as well as a histidine,
which are all potential ligands for a cation such as magnesium.

Consensus pattern: D-[LI]-H-[SA]-x-Q-[IMST]-[QM]-G-[FY]-F-x(2)-P-[LIVMFC]-D 

[ 1] Bower S.G., Harlow K.W., Switzer R.L., Hove-Jensen B. J. Biol. Chem.
264:10287-10291(1989).

424. (PRTP) Herpesvirus processing and transport protein

The members of this family are associated with capsid intermediates during packaging of the
virus.

Number of members: 31 
[1] 

Medline: 98362148

Herpes simplex virus type 1 cleavage and packaging proteins 
UL15 and UL28 are associated with B but not C capsids during 
packaging. Yu D, Weller SK; 
J Virol 1998;72:7428-7439. 

425. Photosystem I psaG / psaK (PS I PSAK) proteins signature 

Photosystem I (PSI) [1] is an integral membrane protein complex that uses light energy to
mediate electron transfer from plastocyanin to ferredoxin. It is found in the chloroplasts of
plants and in cyanobacteria. PSI is composed of at least 14 different subunits, two of which,
PSI-G (gene psaG) and PSI-K (gene psaK), are small hydrophobic proteins of about 7 to 9 Kd
and are evolutionarily related [2]. Both seem to contain two transmembrane regions.
Cyanobacteria seem to encode only PSI-K.

As a signature pattern, the best-conserved region was selected, which seems to
correspond to the second transmembrane region.

-Consensus pattern: [GT]-F-x-[LIVM]-x-[DEA]-x(2)-[GA]-x-[GTA]-[SA]-x-G-H-x-[LIVM]-
[GA]

[1] Golbeck J.H. Biochim. Biophys. Acta 895:167-204(1987). 

[2] Kjaerulff S., Andersen B., Nielsen V.S., Moller B.L., Okkels J.S. J. Biol. Chem.
268:18912-18916(1993).



426. PTR2 family proton/oligopeptide symporters signatures 
A family of eukaryotic and prokaryotic proteins that seem to be mainly
involved in the intake of small peptides with the concomitant uptake of a
proton has been recently characterized [1,2]. Proteins that belong to this
family are:

- Fungal peptide transporter PTR2.

- Mammalian intestine proton-dependent oligopeptide transporter PepT1.

- Mammalian kidney proton-dependent oligopeptide transporter PepT2.

- Drosophila opt1.

- Arabidopsis thaliana peptide transporters PTR2-A and PTR2-B (also known as
the histidine transporting protein NTR1).

- Arabidopsis thaliana proton-dependent nitrate/chlorate transporter CHL1.

- Lactococcus proton-dependent di- and tri-peptide transporter dtpT.

- Caenorhabditis elegans hypothetical protein C06G8.2.

- Caenorhabditis elegans hypothetical protein F56F4.5.

- Caenorhabditis elegans hypothetical protein K04E7.2.

- Escherichia coli hypothetical protein ybgH.

- Escherichia coli hypothetical protein ydgR.

- Escherichia coli hypothetical protein yhiP.

- Escherichia coli hypothetical protein yjdL.

- Bacillus subtilis hypothetical protein yclF.

These integral membrane proteins are predicted to comprise twelve
transmembrane regions. As signature patterns, two of the best
conserved regions were selected. The first is a region that includes the end of the second
transmembrane region, a cytoplasmic loop as well as the third transmembrane
region. The second pattern corresponds to the core of the fifth transmembrane
region.

-Consensus pattern: [GA]-[GAS]-[LIVMFYWA]-[LIVM]-[GAS]-D-x-[LIVMFYWT]-
[LIVMFYW]-G-x(3)-[TAV]-[IV]-x(3)-[GSTAV]-x-[LIVMF]-x(3)-[GA]

-Consensus pattern: [FYT]-x(2)-[LMFY]-[FYV]-[LIVMFYWA]-x-[IVG]-N-[LIVMAG]-G-
[GSA]-[LIMF]

[ 1] Paulsen I.T., Skurray R.A. Trends Biochem. Sci. 19:404-404(1994). 

[ 2] Steiner H.-Y., Naider F., Becker J.M. Mol. Microbiol. 16:825-834(1995). 

427. Pumilio-family RNA binding domains (aka PUM-HD, Pumilio homology domain)

Puf domains are necessary and sufficient for sequence-specific
RNA binding in fly Pumilio and worm FBF-1 and FBF-2. Both proteins
function as translational repressors in early embryonic development
by binding sequences in the 3' UTR of target mRNAs (e.g. the
nanos response element (NRE) in fly Hunchback mRNA, or the point
mutation element (PME) in worm fem-3 mRNA). Other proteins that contain Puf domains are
also plausible RNA binding proteins. JSN1_YEAST, for instance, appears to also contain a
single RRM domain by HMM analysis.

Puf domains usually occur as a tandem repeat of 8 domains.
The Pfam model does not necessarily recognize all 8 domains in
all sequences; some sequences appear to have 5 or 6 domains on
initial analysis, but further analysis suggests the presence
of additional divergent domains.

[1] Zhang B, Gallegos M, Puoti A, Durkin E, Fields S, Kimble J,
Wickens MP. Nature 1997;390:477-484.

[2] Zamore PD, Williamson JR, Lehmann R.
RNA 1997;3:1421-1433.
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428. PWWP domain. The PWWP domain is named after a conserved Pro-Trp-Trp-Pro motif. 
The function of the domain is currently unknown. Number of members: 19 

[1] Medline: 98282232. WHSC1, a 90 kb SET domain-containing gene, expressed in early
development and homologous to a Drosophila dysmorphy gene maps in the Wolf-Hirschhorn
syndrome critical region and is fused to IgH in t(4;14) multiple myeloma. Stec I, Wright TJ,
van Ommen GJB, de Boer PAJ, van Haeringen A, Moorman AFM, Altherr MR, den Dunnen
JT; Hum Mol Genet 1998;7:1071-1082.

429. PX domain 

Eukaryotic domain of unknown function present in phox proteins, PLD isoforms, and a PI3K
isoform.

Number of members: 71
[1]

Medline: 97084820

Novel domains in NADPH oxidase subunits, sorting nexins, and
PtdIns 3-kinases: binding partners of SH3 domains?
Ponting CP;
Protein Sci 1996;5:2353-2357.

430. ParA family ATPase

[1] 

Medline: 91141297 

A family of ATPases involved in active partitioning of 
diverse bacterial plasmids. 

Motallebi-Veshareh M, Rouch DA, Thomas CM;

Mol Microbiol 1990;4:1455-1463. 
Number of members: 122 
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431. (Parvo coat) Parvovirus coat protein. 72 members. 

432. Pectinesterase signatures

Pectinesterase (EC 3.1.1.11) (pectin methylesterase) catalyzes the hydrolysis
of pectin into pectate and methanol. In plants, it plays an important role in
cell wall metabolism during fruit ripening. In plant bacterial pathogens such
as Erwinia carotovora and in fungal pathogens such as Aspergillus niger,
pectinesterase is involved in maceration and soft-rotting of plant tissue.

Prokaryotic and eukaryotic pectinesterases share a few regions of sequence
similarity [1,2,3]. Two of these regions were selected as signature patterns.
The first is based on a region in the N-terminal section of these enzymes; it
contains a conserved tyrosine which may play a role in the catalytic mechanism
[3]. The second pattern corresponds to the best conserved region, an
octapeptide located in the central part of these enzymes.

-Consensus pattern: [GSTNP]-x(6)-[FYVHR]-[IVN]-[KEP]-x-G-[STIVKRQ]-Y- 
[DNQKRMV]-[EP]-x(3)-[LIMVA] 

-Consensus pattern: [IV]-x-G-[STAD]-[LIVT]-D-[FYI]-[IV]-[FSN]-G

[ 1] Ray J., Knapp J., Grierson D., Bird C., Schuch W. Eur. J. Biochem. 174:119-124(1988).

[ 2] Plastow G.S. Mol. Microbiol. 2:247-254(1988). 

[ 3] Markovic O., Joernvall H. Protein Sci. 1:1288-1292(1992). 

433. Pentapeptide repeats (8 copies) 
These repeats are found in many cyanobacterial proteins.
The repeats were first identified in hglK [1]. The function of
these repeats is unknown.

The structure of this repeat has been predicted to be a
beta-helix [2].

The repeat can be approximately described as A(D/N)LXX, where
X can be any amino acid.

Number of members: 75
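
By way of illustration, the approximate repeat description A(D/N)LXX can be expressed as a regular expression and used to locate runs of consecutive five-residue units. The Python sketch below is an assumption-laden example: the function name find_pentapeptide_runs, the min_copies parameter and the test sequence are hypothetical, and an exact-match regular expression is much stricter than the Pfam model, which tolerates divergent copies.

```python
import re

def find_pentapeptide_runs(sequence, min_copies=2):
    """Locate tandem runs of the approximate repeat A(D/N)LXX.

    Returns a list of (start, end, copies) tuples, where start and end are
    0-based, end-exclusive coordinates of each run.
    """
    unit = r"A[DN]L.."  # one five-residue repeat unit; "." stands for X
    run = re.compile(r"(?:%s){%d,}" % (unit, min_copies))
    return [
        (m.start(), m.end(), (m.end() - m.start()) // 5)
        for m in run.finditer(sequence)
    ]

# Hypothetical sequence with three consecutive repeat units.
print(find_pentapeptide_runs("MKADLSGANLTQADLRGPP"))
```

Raising min_copies restricts the output to longer runs, for example min_copies=8 for proteins expected to carry the eight copies named in the family title.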
[1]

Medline: 96062225

The hglK gene is required for localization of
heterocyst-specific glycolipids in the cyanobacterium
Anabaena sp. strain PCC 7120.
Black K, Buikema WJ, Haselkorn R;
J Bacteriol 1995;177:6440-6448.
[2] Medline: 98318059
Structure and distribution of pentapeptide repeats in
bacteria.
Bateman A, Murzin A, Teichmann SA;
Protein Sci 1998;7:1477-1480.
[3] Medline: 98316713
Characterisation of an Arabidopsis cDNA encoding a thylakoid
lumen protein related to a novel 'pentapeptide repeat' family
of proteins.
Kieselbach T, Mant A, Robinson C, Schroder WP;
FEBS Lett 1998;428:241-244.

434. Polypeptide deformylase 
[1] 

Medline: 97002011 

A new subclass of the zinc metalloproteases superfamily
revealed by the solution structure of peptide deformylase.
Meinnel T, Blanquet S, Dardel F;
J Mol Biol 1996;262:375-386.
[2] Medline: 98332750
Solution structure of nickel-peptide deformylase.
Dardel F, Ragusa S, Lazennec C, Blanquet S, Meinnel T;
J Mol Biol 1998;280:501-513.
Number of members: 21
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435. Peptidyl-tRNA hydrolase signatures 

Peptidyl-tRNA hydrolase (EC 3.1.1.29) (PTH) is a bacterial enzyme that cleaves
peptidyl-tRNA or N-acyl-aminoacyl-tRNA to yield free peptides or N-acyl-amino
acids and tRNA. The natural substrate for this enzyme may be peptidyl-tRNAs
which drop off the ribosome during protein synthesis [1,2]. Bacterial PTH has
been found [2,3] to be evolutionarily related to yeast hypothetical protein
YHR189w.

PTH and YHR189w are proteins of about 200 amino acid residues. As signature
patterns, two conserved regions were selected that each contain a histidine.
The first of these regions is located in the N-terminal section, the other in
the central part.

-Consensus pattern: [FY]-x(2)-T-R-H-N-x-G-x(2)-[LIVMFA](2)-[DE]
-Consensus pattern: [GS]-x(3)-H-N-G-[LIVM]-[KR]-[DNS]-[LIVMT] 

[ 1] Garcia-Villegas M.R., De La Vega P.M., Galindo J.M., Segura M., Buckingham R.H.,
Guarneros G. EMBO J. 10:3549-3555(1991).
[ 2] De La Vega P.M., Galindo J.M., Old I.G., Guarneros G. Gene 169:97-100(1996).
[ 3] Ouzounis C., Bork P., Casari G., Sander C. Protein Sci. 4:2424-2428(1995).

436. (Peptidase M17) Cytosol aminopeptidase signature

Cytosol aminopeptidase is a eukaryotic cytosolic zinc-dependent exopeptidase
that catalyzes the removal of unsubstituted amino-acid residues from the
N-terminus of proteins. This enzyme is often known as leucine aminopeptidase
(EC 3.4.11.1) (LAP) but has been shown [1] to be identical with prolyl
aminopeptidase (EC 3.4.11.5). Cytosol aminopeptidase is a hexamer of identical
chains, each of which binds two zinc ions.

Cytosol aminopeptidase is highly similar to Escherichia coli pepA, a
manganese-dependent aminopeptidase. Residues involved in zinc ion-binding [2] in the
mammalian enzyme are absolutely conserved in pepA where they presumably bind
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manganese. 

A cytosol aminopeptidase from Rickettsia prowazekii [3] and one from
Arabidopsis thaliana also belong to this family.

As a signature pattern for these enzymes, a perfectly conserved
octapeptide was selected which contains two residues involved in binding metal ions: an
aspartate and a glutamate.

-Consensus pattern: N-T-D-A-E-G-R-L [The D and the E are zinc/manganese ligands] 
-Note: these proteins belong to family M17 in the classification of peptidases [4,E1].

[ 1] Matsushima M., Takahashi T., Ichinose M., Miki K., Kurokawa K., Takahashi K. 
Biochem. Biophys. Res. Commun. 178:1459-1464(1991). 

[ 2] Burley S.K., David P.R., Sweet R.M., Taylor A., Lipscomb W.N. J. Mol. Biol.
224:113-140(1992).

[ 3] Wood D.O., Solomon M.J., Speed R.R. J. Bacteriol. 175:159-165(1993).
[ 4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:183-228(1995). 

437. Assemblin (Peptidase family S21) 
[1]

Medline: 96399137 

Three-dimensional structure of human cytomegalovirus 
protease. 

Shieh HS, Kurumbail RG, Stevens AM, Stegeman RA, Sturman EJ, 
Pak JY, Wittwer AJ, Palmier MO, Wiegand RC, Holwerda BC,
Stallings WC; 
Nature 1996;383:279-282. 
Number of members: 29 

438. Pollen proteins Ole e I family signature 

The following plant pollen proteins, whose biological function is not yet 
known, are structurally related [1]: 
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- Olive tree pollen major allergen (Ole e I). 

- Tomato anther-specific protein LAT52.

- Maize pollen-specific protein ZmC13.

These proteins are most probably secreted and consist of about 145 residues.
As shown in the following schematic representation, there are six cysteines
which are conserved in the sequence of these proteins. They seem to be
involved in disulfide bonds.

xxxxxxCxCxxxxxxxxxCxxxxxxxxxxxxxxxxxCxxxxxCxxxxxxxxxxxxxxxxxxxxCxxxxxxx
******
'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

-Consensus pattern: [EQ]-G-x-V-Y-C-D-T-C-R [The two C's are probably involved in 
disulfide bonds] 

[ 1] Villalba M., Batanero E., Lopez-Otin C., Sanchez L.M., Monsalve R.I., Gonzalez De La
Pena M.A., Lahoz C., Rodriguez R. Eur. J. Biochem. 216:863-869(1993).

439. Pollen allergen 

This family contains allergens Lol p I, II and III from Lolium perenne.
Number of members: 49
[1]

Medline: 90105394

Complete primary structure of a Lolium perenne (perennial rye
grass) pollen allergen, Lol p III: comparison with known Lol
p I and II sequences.
Ansari AA, Shenbagamurthi P, Marsh DG;
Biochemistry 1989;28:8665-8670.

440. Porphobilinogen deaminase cofactor-binding site

Porphobilinogen deaminase (EC 4.3.1.8), or hydroxymethylbilane synthase, is an 
enzyme involved in the biosynthesis of porphyrins and related macrocycles. It 
catalyzes the assembly of four porphobilinogen (PBG) units in a head to tail 
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fashion to form hydroxymethylbilane. 

The enzyme covalently binds a dipyrromethane cofactor to which the PBG 
subunits are added in a stepwise fashion. In the Escherichia coli enzyme (gene 
hemC), this cofactor has been shown [1] to be bound by the sulfur atom of a 
cysteine. The region around this cysteine is conserved in porphobilinogen
deaminases from various prokaryotic and eukaryotic sources. 

-Consensus pattern: E-R-x-[LIVMFA]-x(3)-[LIVMF]-x-G-[GSA]-C-x-[IVT]-P-[LIVMF]- 
[GSA] [C is the cofactor attachment site] 

[ 1] Miller A.D., Hart G.J., Packman L.C., Battersby A.R. Biochem. J. 254:915-918(1988).

441. Presenilin

Mutations in presenilin-1 are a major cause of early onset Alzheimer's disease [2]. It has
been found that presenilin-1 (Swiss:P49768) binds to beta-catenin in vivo [4]. This family
also contains SPE proteins from C. elegans.
Number of members: 23 
[1] Medline: 98045995
Presenilins and Alzheimer's disease.
Kim TW, Tanzi RE;
Curr Opin Neurobiol 1997;7:683-688.
[2] Medline: 98045995
Presenilins and Alzheimer's disease.
Kim TW, Tanzi RE;
Curr Opin Neurobiol 1997;7:683-688.
[3] Medline: 98099802
Interaction of presenilins with the filamin family of
actin-binding proteins.
Zhang W, Han SW, McKeel DW, Goate A, Wu JY;
J Neurosci 1998;18:914-922.
[4] Medline: 99004850
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Destabilisation of beta-catenin by mutations in presenilin-1 
potentiates neuronal apoptosis. 
Zhang X, Hartmann H, Do VM, Abramowski D, Sturchler-Pierrat 
C, Staufenbiel M, Sommer B, van de Wetering M, Clevers H, 
Saftig P, De Strooper B, He X, Yankner BA;
Nature 1998;395:698-702. 

442. (Pribosyltran) Purine/pyrimidine phosphoribosyl transferases signature 
Phosphoribosyltransferases (PRT) are enzymes that catalyze the synthesis of
beta-N-5'-monophosphates from phosphoribosylpyrophosphate (PRPP) and an
enzyme-specific amine. A number of PRTs are involved in the biosynthesis of purine,
pyrimidine, and pyridine nucleotides, or in the salvage of purines and
pyrimidines. These enzymes are:

- Adenine phosphoribosyltransferase (EC 2.4.2.7) (APRT), which is involved in
purine salvage.

- Hypoxanthine-guanine or hypoxanthine phosphoribosyltransferase (EC 2.4.2.8)
(HGPRT or HPRT), which are involved in purine salvage.

- Orotate phosphoribosyltransferase (EC 2.4.2.10) (OPRT), which is involved
in pyrimidine biosynthesis.

- Amido phosphoribosyltransferase (EC 2.4.2.14), which is involved in purine
biosynthesis.

- Xanthine-guanine phosphoribosyltransferase (EC 2.4.2.22) (XGPRT), which is
involved in purine salvage.

In the sequence of all these enzymes there is a small conserved region which
may be involved in the enzymatic activity and/or be part of the PRPP binding
site [1].

-Consensus pattern: [LIVMFYWCTA]-[LIVM]-[LIVMA]-[LIVMFC]-[DE]-D-[LIVMS]-
[LIVM]-[STAVD]-[STAR]-[GAC]-x-[STAR]

-Note: in position 11 of the pattern most of these enzymes have Gly. 
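
The note above refers to a position within the pattern rather than within a protein. As a hedged illustration, the Python sketch below splits the pattern into its elements, confirms that element 11 is [GAC], and reports which residue a matching sequence carries there. It relies on every element of this particular pattern spanning exactly one residue; the function name residue_at_pattern_position and the test sequence are hypothetical.

```python
import re

# Consensus pattern quoted above, written on one line.
PATTERN = ("[LIVMFYWCTA]-[LIVM]-[LIVMA]-[LIVMFC]-[DE]-D-[LIVMS]-"
           "[LIVM]-[STAVD]-[STAR]-[GAC]-x-[STAR]")

# Each element covers a single residue, so element n maps to match offset n - 1.
ELEMENTS = PATTERN.split("-")
REGEX = re.compile("".join("." if e == "x" else e for e in ELEMENTS))

def residue_at_pattern_position(sequence, position):
    """Return the residue found at a 1-based pattern position, or None if no match."""
    match = REGEX.search(sequence)
    if match is None:
        return None
    return match.group(0)[position - 1]

# Hypothetical sequence that satisfies the pattern.
print(ELEMENTS[10])
print(residue_at_pattern_position("MKVLVVDDVLATGGTLL", 11))
```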



[ 1] Hershey H.V., Taylor M.W. Gene 43:287-293(1986). 
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443. (Pro CA) 

Prokaryotic-type carbonic anhydrases signatures 

Carbonic anhydrases (EC 4.2.1.1) (CA) are zinc metalloenzymes which catalyze the 
reversible hydration of carbon dioxide. In Escherichia coli, CA (gene cynT) is involved in 
recycling carbon dioxide formed in the bicarbonate-dependent decomposition of cyanate by 
cyanase (gene cynS). By this action, it prevents the depletion of cellular bicarbonate [1]. In 
photosynthetic bacteria and plant chloroplasts, CA is essential to inorganic carbon fixation [2].
Prokaryotic and plant chloroplast CAs are structurally and evolutionarily related and form a
family distinct from the one which groups the many different forms of eukaryotic CAs (see
<PDOC00146>). Hypothetical proteins yadF from Escherichia coli and HI1301 from 
Haemophilus influenzae also belong to this family. Two signature patterns were developed 
for this family of enzymes. Both patterns contain conserved residues that could be involved 
in binding zinc (cysteine and histidine). 

-Consensus pattern: C-[SA]-D-S-R-[LIVM]-x-[AP] 

-Consensus pattern: [EQ]-Y-A-[LIVM]-x(2)-[LIVM]-x(4)-[LIVMF](3)-x-G-H-x(2)-C-G 

[ 1] Guilloton M.B., Korte J.J., Lamblin A.F., Fuchs J.A., Anderson P.M. J. Biol. Chem. 
267:3731-3734(1992). 

[ 2] Fukuzawa H., Suzuki E., Komukai Y., Miyachi S. Proc. Natl. Acad. Sci. U.S.A. 
89:4437-4441(1992). 

444. (Prolyl_oligopep) 

Prolyl oligopeptidase family serine active site 

The prolyl oligopeptidase family [1,2,3] consists of a number of evolutionarily related
peptidases whose catalytic activity seems to be provided by a charge relay system similar to
that of the trypsin family of serine proteases, but which arose by independent convergent
evolution. The known members of this family are listed below.
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- Prolyl endopeptidase (EC 3.4.21.26) (PE) (also called post-proline cleaving enzyme). PE is 
an enzyme that cleaves peptide bonds on the C-terminal side of prolyl residues. The sequence 
of PE has been obtained from a mammalian species (pig) and from bacteria (Flavobacterium 
meningosepticum and Aeromonas hydrophila); there is a high degree of sequence 
conservation between these sequences. 

- Escherichia coli protease II (EC 3.4.21.83) (oligopeptidase B) (gene ptrB) which cleaves
peptide bonds on the C-terminal side of lysyl and argininyl residues.

- Dipeptidyl peptidase IV (EC 3.4.14.5) (DPP IV). DPP IV is an enzyme that removes N- 
terminal dipeptides sequentially from polypeptides having unsubstituted N-termini provided 
that the penultimate residue is proline. 

- Yeast vacuolar dipeptidyl aminopeptidase A (DPAP A) (gene: STE13) which is responsible 
for the proteolytic maturation of the alpha-factor precursor. 

- Yeast vacuolar dipeptidyl aminopeptidase B (DPAP B) (gene: DAP2). 

- Acylamino-acid-releasing enzyme (EC 3.4.19.1) (acyl-peptide hydrolase). 

This enzyme catalyzes the hydrolysis of the amino-terminal peptide bond of an N-acetylated
protein to generate an N-acetylated amino acid and a protein with a free amino-terminus.

A conserved serine residue has experimentally been shown (in E. coli protease II as well as in
pig and bacterial PE) to be necessary for the catalytic mechanism. This serine, which is part
of the catalytic triad (Ser, His, Asp), is generally located about 150 residues away from the
C-terminal extremity of these enzymes (which are all proteins that contain about 700 to 800
amino acids).

Consensus pattern: D-x(3)-A-x(3)-[LIVMFYW]-x(14)-G-x-S-x-G-G-[LIVMFYW](2) [S is
the active site residue]
Sequences known to belong to this class detected by the pattern: ALL,
except for yeast DPAP A.
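
As a sketch under stated assumptions, the consensus pattern above can be used to locate the candidate active-site serine and to measure how far it lies from the C-terminus, which the text places at roughly 150 residues. The regular expression is a manual transcription of the pattern with a named group on the S; the function name locate_active_site_serine and the synthetic test sequence are hypothetical.

```python
import re

# D-x(3)-A-x(3)-[LIVMFYW]-x(14)-G-x-S-x-G-G-[LIVMFYW](2), with the serine captured.
ACTIVE_SITE = re.compile(
    r"D.{3}A.{3}[LIVMFYW].{14}G.(?P<ser>S).GG[LIVMFYW]{2}"
)

def locate_active_site_serine(sequence):
    """Return (1-based serine position, residues between serine and C-terminus).

    Returns None when the pattern is not found in the sequence.
    """
    match = ACTIVE_SITE.search(sequence)
    if match is None:
        return None
    index = match.start("ser")                 # 0-based index of the serine
    return index + 1, len(sequence) - 1 - index

# Synthetic sequence: 10-residue prefix, one pattern instance, 145-residue tail.
synthetic = "M" * 10 + "DAAAAKKKL" + "T" * 14 + "GTSTGGLL" + "K" * 145
print(locate_active_site_serine(synthetic))
```

The second value can be compared against the approximate figure of 150 given above; a large deviation would suggest a fortuitous match rather than a true family member.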

Note: these proteins belong to families S9A/S9B/S9C in the classification of peptidases [4]. 



[ 1] Rawlings N.D., Polgar L., Barrett A.J. Biochem. J. 279:907-911(1991). 
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[ 2] Barrett A.J., Rawlings N.D. 

[ 3] Polgar L., Szabo E. 

[ 4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994).

445. (Pterin 4a) 

Pterin 4 alpha carbinolamine dehydratase 

Pterin 4 alpha carbinolamine dehydratase is also known as DCoH (dimerisation cofactor of
hepatocyte nuclear factor 1-alpha).

Number of members: 11

[1] Cronk JD, Endrizzi JA, Alber T; Medline: 97052967 "High-resolution structures of the 
bifunctional enzyme and transcriptional coactivator DCoH and its complex with a product 
analogue." Protein Sci 1996;5:1963-1972. 

446. (Pyridox oxidase) 

Pyridoxamine 5'-phosphate oxidase signature 

Pyridoxamine 5'-phosphate oxidase (EC 1.4.3.5) is an FMN flavoprotein involved in the de
novo synthesis of pyridoxine (vitamin B6) and pyridoxal phosphate. It oxidizes
pyridoxamine-5-P (PMP) and pyridoxine-5-P (PNP) to pyridoxal-5-P. The sequences of the
enzyme from bacterial (genes pdxH or fprA) [1] and fungal (gene PDX3) [2] sources show
that this protein has been highly conserved throughout evolution.

PdxH is evolutionarily related [3] to one of the enzymes in the phenazine biosynthesis
protein pathway, phzD (also known as phzG). As a signature pattern, a highly conserved
region located in the C-terminal part of these enzymes was selected.

-Consensus pattern: [LIVF]-E-F-W-[QHG]-x(4)-R-[LIVM]-H-[DNE]-R




[ 1] Lam H.-M., Winkler M.E. J. Bacteriol. 174:6033-6045(1992). 

[ 2] Loubbardi A., Karst F., Guilloton M., Marcireau C. J. Bacteriol. 177:1817-1823(1995). 
[ 3] Pierson L.S. III, Gaffney T., Lam S., Gong F. FEMS Microbiol. Lett. 134:299-307(1995).

447. (Pyrophosphatase) 

Inorganic pyrophosphatase signature 

Inorganic pyrophosphatase (EC 3.6.1.1) (PPase) [1,2] is the enzyme responsible for the
hydrolysis of pyrophosphate (PPi) which is formed principally as the product of the many
biosynthetic reactions that utilize ATP. All known PPases require the presence of divalent
metal cations, with magnesium conferring the highest activity. Among other residues, a
lysine has been postulated to be part of or close to the active site. PPases have been sequenced
from bacteria such as Escherichia coli (homohexamer), thermophilic bacterium PS-3 and
Thermus thermophilus, from the archaebacterium Thermoplasma acidophilum, from fungi
(homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of
PPase has been characterized which seems to be involved in energy production and whose
activity is stimulated by uncouplers of ATP synthesis.

The sequences of PPases share some regions of similarity. As a signature pattern, a region
was selected that contains three conserved aspartates that are involved in the binding of
cations.

25 

-Consensus pattern: D-[SGDN]-D-[PE]-[LIVMF]-D-[LIVMGAC] 
[The three D's bind divalent metal cations] 
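
The consensus patterns in this table follow the PROSITE notation: elements are separated by hyphens, square brackets list allowed residues, braces list excluded residues, x stands for any residue, and a parenthesized number or range gives a repeat count. The sketch below shows one way such a pattern could be translated into a regular expression and applied to a sequence. The function name prosite_to_regex and the test peptide are illustrative assumptions and are not part of the specification.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern string into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        element = element.strip()
        if not element:
            continue  # tolerate a stray trailing hyphen
        # Separate the residue specification from an optional (n) or (n,m) count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # excluded residues
        # Bracketed sets and single letters are already valid regex syntax.
        if lo and hi:
            core += "{%s,%s}" % (lo, hi)
        elif lo:
            core += "{%s}" % lo
        parts.append(core)
    return "".join(parts)

# Pattern taken from the text; the peptide is a made-up string, not a real PPase.
PPASE = prosite_to_regex("D-[SGDN]-D-[PE]-[LIVMF]-D-[LIVMGAC]")
hit = re.search(PPASE, "MKAAADGDPLDVLLK")
print(PPASE, hit.start() if hit else None)
```

The same helper accepts the range and exclusion elements used elsewhere in the table, such as x(5,18) or {PW}.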

[ 1] Lahti R., Kolakowski L.F. Jr., Heinonen J., Vihinen M., Pohjanoksa K., Cooperman
B.S. Biochim. Biophys. Acta 1038:338-345(1990).

[ 2] Cooperman B.S., Baykov A.A., Lahti R. Trends Biochem. Sci. 17:262-266(1992). 

448. (Peptidase S26)

Signal peptidases I signatures. 

Signal peptidases (SPases) [1] (also known as leader peptidases) remove the signal peptides from
secretory proteins. In prokaryotes three types of SPases are known: type I (gene lepB), which
is responsible for the processing of the majority of exported pre-proteins; type II (gene lsp),
which only processes lipoproteins; and a third type involved in the processing of pili subunits.
SPase I (EC 3.4.21.89) is an integral membrane protein that is anchored in the cytoplasmic
membrane by one (in B. subtilis) or two (in E. coli) N-terminal transmembrane domains, with
the main part of the protein protruding into the periplasmic space. Two residues have been
shown [2,3] to be essential for the catalytic activity of SPase I: a serine and a lysine. SPase I
is evolutionarily related to the yeast mitochondrial inner membrane protease subunits 1 and 2
(genes IMP1 and IMP2), which catalyze the removal of signal peptides required for the
targeting of proteins from the mitochondrial matrix, across the inner membrane, into the
inter-membrane space [4]. In eukaryotes the removal of signal peptides is effected by an
oligomeric enzymatic complex composed of at least five subunits: the signal peptidase
complex (SPC). The SPC is located in the endoplasmic reticulum membrane. Two
components of mammalian SPC, the 18 Kd (SPC18) and the 21 Kd (SPC21) subunits, as well
as the yeast SEC11 subunit, have been shown [5] to share regions of sequence similarity with
prokaryotic SPases I and yeast IMP1/IMP2. Three signature patterns have been developed for
these proteins. The first signature contains the putative active site serine; the second signature
contains the putative active site lysine, which is not conserved in the SPC subunits; and the
third signature corresponds to a conserved region of unknown biological significance which
is located in the C-terminal section of all these proteins.

Consensus pattern: [GS]-x-S-M-x-[PS]-[AT]-[LF] [S is an active site residue]

Consensus pattern: K-R-[LIVMSTA](2)-G-x-[PG]-G-[DE]-x-[LIVM]-x-[LIVMFY] [K is an
active site residue]

Consensus pattern: [LIVMFYW](2)-x(2)-G-D-[NH]-x(3)-[SND]-x(2)-[SG]

[ 1] Dalbey R.E., von Heijne G. Trends Biochem. Sci. 17:474-478(1992).
[ 2] Sung M., Dalbey R.E. J. Biol. Chem. 267:13154-13159(1992).
[ 3] Black M.T. J. Bacteriol. 175:4957-4961(1993).
[ 4] Nunnari J., Fox T.D., Walter P. Science 262:1997-2004(1993).
[ 5] van Dijl J.M., de Jong A., Vehmaanpera J., Venema G., Bron S. EMBO J. 11:2819-2828(1992).
[ 6] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994).
[E1]



449. (Peptidase C1) Eukaryotic thiol (cysteine) proteases active sites. Eukaryotic thiol
proteases (EC 3.4.22.-) [1] are a family of proteolytic enzymes which contain an active site
cysteine. Catalysis proceeds through a thioester intermediate and is facilitated by a nearby
histidine side chain; an asparagine completes the essential catalytic triad. The proteases
which are currently known to belong to this family are listed below (references are only
provided for recently determined sequences). - Vertebrate lysosomal cathepsins B (EC
3.4.22.1), H (EC 3.4.22.16), L (EC 3.4.22.15), and S (EC 3.4.22.27) [2]. - Vertebrate
lysosomal dipeptidyl peptidase I (EC 3.4.14.1) (also known as cathepsin C) [2]. - Vertebrate
calpains (EC 3.4.22.17). Calpains are intracellular calcium-activated thiol proteases that
contain both an N-terminal catalytic domain and a C-terminal calcium-binding domain. -
Mammalian cathepsin K, which seems involved in osteoclastic bone resorption [3]. - Human
cathepsin O [4]. - Bleomycin hydrolase. An enzyme that catalyzes the inactivation of the
antitumor drug BLM (a glycopeptide). - Plant enzymes: barley aleurain (EC 3.4.22.16),
EP-B1/B4; kidney bean EP-C1, rice bean SH-EP; kiwi fruit actinidin (EC 3.4.22.14); papaya
latex papain (EC 3.4.22.2), chymopapain (EC 3.4.22.6), caricain (EC 3.4.22.30), and
proteinase IV (EC 3.4.22.25); pea turgor-responsive protein 15A; pineapple stem bromelain
(EC 3.4.22.32); rape COT44; rice oryzain alpha, beta, and gamma; tomato low-temperature
induced; Arabidopsis thaliana A494, RD19A and RD21A. - House-dust mite allergens
DerP1 and EurM1. - Cathepsin B-like proteinases from the worms Caenorhabditis elegans
(genes gcp-1, cpr-3, cpr-4, cpr-5 and cpr-6), Schistosoma mansoni (antigen SM31) and
japonicum (antigen SJ31), Haemonchus contortus (genes AC-1 and AC-2), and Ostertagia
ostertagi (CP-1 and CP-3). - Slime mold cysteine proteinases CP1 and CP2. - Cruzipain from
Trypanosoma cruzi and brucei. - Trophozoite cysteine proteinase (TCP) from various
Plasmodium species. - Proteases from Leishmania mexicana, Theileria annulata and Theileria
parva. - Baculovirus cathepsin-like enzyme (v-cath). - Drosophila small optic lobes protein
(gene sol), a neuronal protein that contains a calpain-like domain. - Yeast thiol protease
BLH1/YCP1/LAP3. - Caenorhabditis elegans hypothetical protein C06G4.2, a calpain-like
protein. Two bacterial peptidases are also part of this family: - Aminopeptidase C from
Lactococcus lactis (gene pepC) [5]. - Thiol protease tpr from Porphyromonas gingivalis.
Three other proteins are structurally related to this family, but may have lost their proteolytic
activity. - Soybean oil body protein P34. This protein has its active site cysteine replaced by a
glycine. - Rat testin, a Sertoli cell secretory protein highly similar to cathepsin L but with the
active site cysteine replaced by a serine. Rat testin should not be confused with mouse
testin, which is a LIM-domain protein (see <PDOC00382>). - Plasmodium falciparum
serine-repeat protein (SERA), the major blood stage antigen. This protein of 111 Kd possesses a
C-terminal thiol-protease-like domain [6], but the active site cysteine is replaced by a serine.
The sequences around the three active site residues are well conserved and can be used as
signature patterns.

Consensus pattern: Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC]-[STAGCV] [C is the active site
residue] - Note: the residue in position 4 of the pattern is almost always cysteine; the only
exceptions are calpains (Leu), bleomycin hydrolase (Ser) and yeast YCP1 (Ser). - Note: the
residue in position 5 of the pattern is always Gly except in papaya protease IV where it is
Glu.
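
Because each signature annotates one residue as the active site, a match can be used to report where that residue lies in a query sequence. The following is a minimal sketch for the cysteine pattern above, hand-translated into a regular expression with a named group on the annotated cysteine. The names CYS_SITE and active_site_cysteines and the test peptide are hypothetical and serve only as illustration.

```python
import re

# First thiol protease signature from the text, written as a regex.
# The named group "site" marks the cysteine annotated as the active site residue.
CYS_SITE = re.compile(r"Q.{3}[GE].(?P<site>C)[YW].{2}[STAGC][STAGCV]")

def active_site_cysteines(seq):
    """Return the 1-based position of the annotated cysteine for each match."""
    return [m.start("site") + 1 for m in CYS_SITE.finditer(seq)]

# Made-up peptide for illustration; it is not a real protease sequence.
TOY = "AAQGSCGSCWAFSAVV"
print(active_site_cysteines(TOY))
```

Positions are reported 1-based, as is conventional for protein sequences; finditer returns non-overlapping matches only.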

Consensus pattern: [LIVMGSTAN]-x-H-[GSACE]-[LIVM]-x-[LIVMAT](2)-G-x-
[GSADNH] [H is the active site residue]

Consensus pattern: [FYCH]-[WI]-[LIVT]-x-[KRQAG]-N-[ST]-W-x(3)-[FYW]-G-x(2)-G-
[LFYW]-[LIVMFYG]-x-[LIVMF] [N is the active site residue] - Note: these proteins belong
to families C1 (papain-type) and C2 (calpains) in the classification of peptidases [7,E1].

[ 1] Dufour E. Biochimie 70:1335-1342(1988).
[ 2] Kirschke H., Barrett A.J., Rawlings N.D. Protein Prof. 2:1587-1643(1995).
[ 3] Shi G.-P., Chapman H.A., Bhairi S.M., Deleeuw C., Reddy V.Y., Weiss S.J. FEBS Lett. 357:129-134(1995).
[ 4] Velasco G., Ferrando A.A., Puente X.S., Sanchez L.M., Lopez-Otin C. J. Biol. Chem. 269:27136-27142(1994).
[ 5] Chapot-Chartier M.P., Nardi M., Chopin M.C., Chopin A., Gripon J.C. Appl. Environ. Microbiol. 59:330-333(1993).
[ 6] Higgins D.G., McConnell D.J., Sharp P.M. Nature 340:604-604(1989).
[ 7] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:461-486(1994).

450. (peptidase M24) Aminopeptidase P and proline dipeptidase signature (1).
Aminopeptidase P (EC 3.4.11.9) is the enzyme responsible for the release of any N-terminal
amino acid adjacent to a proline residue. Proline dipeptidase (EC 3.4.13.9) (prolidase) splits
dipeptides with a prolyl residue in the carboxyl terminal position. Bacterial aminopeptidase P
II (gene pepP) [1], proline dipeptidase (gene pepQ) [2], and human proline dipeptidase (gene
PEPD) [3] are evolutionarily related. These proteins are manganese metalloenzymes. Yeast
hypothetical proteins YER078c and YFR006w and Mycobacterium tuberculosis hypothetical
protein MtCY49.29c also belong to this family. As a signature pattern for these enzymes, a
conserved region that contains three histidine residues has been developed.

Consensus pattern: [HA]-[GSYR]-[LIVMT]-[SG]-H-x-[LIV]-G-[LIVM]-x-[IV]-H-[DE]

[ 1] Yoshimoto T., Tone H., Honda T., Osatomi K., Kobayashi R., Tsuru D. J. Biochem. 105:412-416(1989).
[ 2] Nakahigashi K., Inokuchi H. Nucleic Acids Res. 18:6439-6439(1990).
[ 3] Endo F., Tanoue A., Nakai H., Hata A., Indo Y., Titani K., Matsuda I. J. Biol. Chem. 264:4476-4481(1989).
[ 4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:183-228(1995).

Methionine aminopeptidase signatures (2). Methionine aminopeptidase (EC 3.4.11.18)
(MAP) is responsible for the removal of the amino-terminal (initiator) methionine from
nascent eukaryotic cytosolic and cytoplasmic prokaryotic proteins if the penultimate amino
acid is small and uncharged. All MAPs studied to date are monomeric proteins that require
cobalt ions for activity. Two subfamilies of MAP enzymes are known to exist [1,2]. While
being evolutionarily related, they only share a limited amount of sequence similarity, mostly
clustered around the residues shown, in the Escherichia coli MAP [3], to be involved in
cobalt-binding. The first subfamily consists of enzymes from prokaryotes as well as
eukaryotic MAP-1, while the second subfamily is made up of archaebacterial MAP and
eukaryotic MAP-2. The second subfamily also includes proteins which do not seem to be
MAP, but that are clearly evolutionarily related, such as mouse proliferation-associated protein
1 and fission yeast curved DNA-binding protein. For each of these subfamilies, a specific
signature pattern that includes residues known to be involved in cobalt-binding has been
developed.

Consensus pattern: [MFY]-x-G-H-G-[LIVMC]-[GSH]-x(3)-H-x(4)-[LIVM]-x-[HN]-[YWV]
[H is a cobalt ligand]

Consensus pattern: [DA]-[LIVMY]-x-K-[LIVM]-D-x-G-x-[HQ]-[LIVM]-[DNS]-G-x(3)-
[DN] [The second D and the last D/N are cobalt ligands]

[ 1] Arfin S.M., Kendall R.L., Hall L., Weaver L.H., Stewart A.E., Matthews B.W., Bradshaw R.A. Proc. Natl. Acad. Sci. U.S.A. 92:7714-7718(1995).
[ 2] Keeling P.J., Doolittle W.F. Trends Biochem. Sci. 21:285-286(1996).
[ 3] Roderick S.L., Matthews B.W. Biochemistry 32:3907-3912(1993).
[ 4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:183-228(1995).

451. Cytochrome P450 cysteine heme-iron ligand signature 

Cytochrome P450's [1,2,3,E1] are a group of enzymes involved in the oxidative metabolism
of a large number of natural compounds (such as steroids, fatty acids, prostaglandins,
leukotrienes, etc.) as well as drugs, carcinogens and mutagens. Based on sequence similarities,
P450's have been classified into about forty different families [4,5]. P450's are proteins of
400 to 530 amino acids; the only exception is Bacillus BM-3 (CYP102), which is a protein of
1048 residues that contains an N-terminal P450 domain followed by a reductase domain.
P450's are heme proteins. A conserved cysteine residue in the C-terminal part of P450's is
involved in binding the heme iron in the fifth coordination site. From a region around this
residue, a ten-residue signature specific to P450's was developed.

Consensus pattern: [FW]-[SGNH]-x-[GD]-x-[RHPT]-x-C-[LIVMFAP]-[GAD] [C is the
heme iron ligand]

[ 1] Nebert D.W., Gonzalez F.J. Annu. Rev. Biochem. 56:945-993(1987). 

[ 2] Coon M.J., Ding X., Pernecky S.J., Vaz A.D.N. FASEB J. 6:669-673(1992). 

[ 3] Guengerich F.P. J. Biol. Chem. 266:10019-10022(1991).

[ 4] Nelson D.R., Kamataki T., Waxman D.J., Guengerich F.P., Estabrook R.W., Feyereisen
R., Gonzalez F.J., Coon M.J., Gunsalus I.C., Gotoh O., Okuda K., Nebert D.W. DNA Cell 
Biol. 12:1-51(1993). 

[ 5] Degtyarenko K.N., Archakov A.I. FEBS Lett. 332:1-8(1993).

452. (Pec Lyase) Pectate lyase

This enzyme forms a right-handed beta helix structure. Pectate lyase is an enzyme
involved in the maceration and soft rotting of plant tissue. 

[1] Yoder MD, Keen NT, Jurnak F, Science 1993;260:1503-1507. 

453. (pep M24) Aminopeptidase P and proline dipeptidase signature (pep1)
Aminopeptidase P (EC 3.4.11.9) is the enzyme responsible for the release of any N-terminal
amino acid adjacent to a proline residue. Proline dipeptidase (EC 3.4.13.9) (prolidase) splits
dipeptides with a prolyl residue in the carboxyl terminal position. Bacterial aminopeptidase P
II (gene pepP) [1], proline dipeptidase (gene pepQ) [2], and human proline dipeptidase (gene
PEPD) [3] are evolutionarily related. These proteins are manganese metalloenzymes. Yeast
hypothetical proteins YER078c and YFR006w and Mycobacterium tuberculosis hypothetical
protein MtCY49.29c also belong to this family. As a signature pattern for these enzymes, a
conserved region was selected that contains three histidine residues.

Consensus pattern: [HA]-[GSYR]-[LIVMT]-[SG]-H-x-[LIV]-G-[LIVM]-x-[IV]-H-[DE]

[ 1] Yoshimoto T., Tone H., Honda T., Osatomi K., Kobayashi R., Tsuru D. J. Biochem.
105:412-416(1989).

[ 2] Nakahigashi K., Inokuchi H. Nucleic Acids Res. 18:6439-6439(1990).

[ 3] Endo F., Tanoue A., Nakai H., Hata A., Indo Y., Titani K., Matsuda I. J. Biol. Chem.
264:4476-4481(1989).

[ 4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:183-228(1995).

Methionine aminopeptidase signatures (pep2)

Methionine aminopeptidase (EC 3.4.11.18) (MAP) is responsible for the removal of the
amino-terminal (initiator) methionine from nascent eukaryotic cytosolic and cytoplasmic
prokaryotic proteins if the penultimate amino acid is small and uncharged. All MAPs studied
to date are monomeric proteins that require cobalt ions for activity. Two subfamilies of MAP
enzymes are known to exist [1,2]. While being evolutionarily related, they only share a limited
amount of sequence similarity, mostly clustered around the residues shown, in the Escherichia
coli MAP [3], to be involved in cobalt-binding. The first subfamily consists of enzymes from
prokaryotes as well as eukaryotic MAP-1, while the second subfamily is made up of
archaebacterial MAP and eukaryotic MAP-2. The second subfamily also includes proteins
which do not seem to be MAP, but that are clearly evolutionarily related, such as mouse
proliferation-associated protein 1 and fission yeast curved DNA-binding protein. For each of
these subfamilies, a specific signature pattern was developed that includes residues known to
be involved in cobalt-binding.

Consensus pattern: [MFY]-x-G-H-G-[LIVMC]-[GSH]-x(3)-H-x(4)-[LIVM]-x-[HN]-[YWV]
[H is a cobalt ligand]

Consensus pattern: [DA]-[LIVMY]-x-K-[LIVM]-D-x-G-x-[HQ]-[LIVM]-[DNS]-G-x(3)-
[DN] [The second D and the last D/N are cobalt ligands]

[ 1] Arfin S.M., Kendall R.L., Hall L., Weaver L.H., Stewart A.E., Matthews B.W.,
Bradshaw R.A. Proc. Natl. Acad. Sci. U.S.A. 92:7714-7718(1995).

[ 2] Keeling P.J., Doolittle W.F. Trends Biochem. Sci. 21:285-286(1996).

[ 3] Roderick S.L., Matthews B.W. Biochemistry 32:3907-3912(1993).

[ 4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:183-228(1995).

454. Peroxidases signatures 

Peroxidases (EC 1.11.1.-) [1] are heme-binding enzymes that carry out a variety of
biosynthetic and degradative functions using hydrogen peroxide as the electron acceptor.
Peroxidases are widely distributed throughout bacteria, fungi, plants, and vertebrates. In
peroxidases the heme prosthetic group is protoporphyrin IX and the fifth ligand of the heme
iron is a histidine (known as the proximal histidine). Another histidine residue (the distal
histidine) serves as an acid-base catalyst in the reaction between hydrogen peroxide and the
enzyme. The regions around these two active site residues are more or less conserved in a
majority of peroxidases [2,3]. The enzymes in which one or both of these regions can be
found are listed below. - Yeast cytochrome c peroxidase (EC 1.11.1.5). - Myeloperoxidase
(EC 1.11.1.7) (MPO). MPO is found in granulocytes and monocytes and plays a major role in
the oxygen-dependent microbicidal system of neutrophils. - Lactoperoxidase (EC 1.11.1.7)
(LPO). LPO is a milk protein which acts as an antimicrobial agent. - Eosinophil peroxidase
(EC 1.11.1.7) (EPO). An enzyme found in the cytoplasmic granules of eosinophils. - Thyroid
peroxidase (EC 1.11.1.8) (TPO). TPO plays a central role in the biosynthesis of thyroid
hormones. It catalyzes the iodination and coupling of the hormonogenic tyrosines in
thyroglobulin to yield the thyroid hormones T3 and T4. - Fungal ligninases. Ligninase
catalyzes the first step in the degradation of lignin. It depolymerizes lignin by catalyzing the
C(alpha)-C(beta) cleavage of the propyl side chains of lignin. - Plant peroxidases (EC
1.11.1.7). Plants express a large number of isozymes of peroxidases. Some of them play a
role in cell-suberization by catalyzing the deposition of the aromatic residues of suberin on
the cell wall, some are expressed as a defense response toward wounding, others are involved
in the metabolism of auxin and the biosynthesis of lignin. - Prokaryotic catalase-peroxidases.
Some bacterial species produce enzymes that exhibit both catalase and broad-spectrum
peroxidase activities [4]. Examples of such enzymes are: catalase HPI from Escherichia coli
(gene katG) and perA from Bacillus stearothermophilus.

Consensus pattern: [DET]-[LIVMTA]-x(2)-[LIVM]-[LIVMSTAG]-[SAG]-[LIVMSTAG]-H-
[STA]-[LIVMFY] [H is the proximal heme-binding ligand]

Consensus pattern: [SGATV]-x(3)-[LIVMA]-R-[LIVMA]-x-[FW]-H-x-[SAC] [H is an active
site residue]

[ 1] Dawson J.H. Science 240:433-439(1988).
[ 2] Kimura S., Ikeda-Saito M. Proteins 3:113-120(1988).
[ 3] Henrissat B., Saloheimo M., Lavaitte S., Knowles J.K.C. Proteins 8:251-257(1990).
[ 4] Welinder K.G. Biochim. Biophys. Acta 1080:215-220(1991).

455. pfkB family of carbohydrate kinases signatures

It has been shown [1,2,3] that the following carbohydrate and purine kinases are evolutionarily
related and can be grouped into a single family, which is known [1] as the 'pfkB family': -
Fructokinase (EC 2.7.1.4) (gene scrK). - 6-phosphofructokinase isozyme 2 (EC 2.7.1.11)
(phosphofructokinase-2) (gene pfkB). pfkB is a minor phosphofructokinase isozyme in
Escherichia coli and is not evolutionarily related to the major isozyme (gene pfkA). Plant
6-phosphofructokinases also belong to this family. - Ribokinase (EC 2.7.1.15) (gene rbsK). -
Adenosine kinase (EC 2.7.1.20) (gene ADK). - 2-dehydro-3-deoxygluconokinase (EC
2.7.1.45) (gene kdgK). - 1-phosphofructokinase (EC 2.7.1.56) (fructose 1-phosphate kinase)
(gene fruK). - Inosine-guanosine kinase (EC 2.7.1.73) (gene gsk). - Tagatose-6-phosphate
kinase (EC 2.7.1.144) (phosphotagatokinase) (gene lacC). - Escherichia coli hypothetical
protein yeiC. - Escherichia coli hypothetical protein yeiI. - Escherichia coli hypothetical
protein yhfQ. - Escherichia coli hypothetical protein yihV. - Bacillus subtilis hypothetical
protein yxdC. - Yeast hypothetical protein YJR105w. All the above kinases are proteins of
280 to 430 amino acid residues that share a few regions of sequence similarity. Two of
these regions were selected as signature patterns. The first pattern is based on a region rich in
glycine which is located in the N-terminal section of these enzymes, while the second pattern
is based on a conserved region in the C-terminal section.

Consensus pattern: [AG]-G-x(0,1)-[GAP]-x-N-x-[STA]-x(6)-[GS]-x(9)-G

Consensus pattern: [DNSK]-[PSTV]-x-[SAG](2)-[GD]-D-x(3)-[SAGV]-[AG]-[LIVMFYA]-
[LIVMSTAP]

[ 1] Wu L.-F., Reizer A., Reizer J., Cai B., Tomich J.M., Saier M.H. Jr. J. Bacteriol. 
173:3117-3127(1991). 

[ 2] Orchard L.M.D., Kornberg H.L. Proc. R. Soc. Lond., B, Biol. Sci. 242:87-90(1990). 
[ 3] Blatch G.L., Scholle R.R., Woods D.R. Gene 95:17-23(1990). 

456. Phospholipase A2 active sites signatures 

Phospholipase A2 (EC 3.1.1.4) (PA2) [1,2] is an enzyme which releases fatty acids from the
second carbon group of glycerol. PA2's are small and rigid proteins of 120 amino-acid
residues that have four to seven disulfide bonds. PA2 binds a calcium ion which is required
for activity. The side chains of two conserved residues, a histidine and an aspartic acid, 
participate in a 'catalytic network'. Many PA2's have been sequenced from snakes, lizards, 
bees and mammals. In the latter, there are at least four forms: pancreatic, membrane- 
associated as well as two less characterized forms. The venom of most snakes contains 
multiple forms of PA2. Some of them are presynaptic neurotoxins which inhibit 
neuromuscular transmission by blocking acetylcholine release from the nerve termini. Two 
different signature patterns were derived for PA2's. The first is centered on the active site 
histidine and contains three cysteines involved in disulfide bonds. The second is centered on 
the active site aspartic acid and also contains three cysteines involved in disulfide bonds. 

Consensus pattern: C-C-x(2)-H-x(2)-C [H is the active site residue] This pattern will not
detect some snake toxins that are homologous with PA2 but have lost their catalytic activity, nor
otoconin-22, a Xenopus protein from the aragonitic otoconia which is also unlikely to
be enzymatically active.

Consensus pattern: [LIVMA]-C-{LIVMFYWPCST}-C-D-x(5)-C [D is the active site
residue] This pattern detects the majority of functional and non-functional PA2's. Undetected
sequences are bee PA2, gila monster PA2's, PA2 PL-X from habu and PA2 PA-5 from mulga.

[ 1] Davidson F.F., Dennis E.A. J. Mol. Evol. 31:228-238(1990). 

[ 2] Gomez P., Vandermeers A., Vandermeers-Piret M.-C., Herzog R., Rathe J., Stievenart
M., Winand J., Christophe J. Eur. J. Biochem. 186:23-33(1989). 

457. Phosphorylase pyridoxal-phosphate attachment site. Phosphorylases (EC 2.4.1.1) [1] are
important allosteric enzymes in carbohydrate metabolism. They catalyze the formation of
glucose 1-phosphate from polyglucose such as glycogen, starch or maltodextrin. Enzymes
from different sources differ in their regulatory mechanisms and their natural substrates. 
However, all known phosphorylases share catalytic and structural properties. They are 
pyridoxal-phosphate dependent enzymes; the pyridoxal-P group is attached to a lysine 
residue around which the sequence is highly conserved and can be used as a signature pattern 
to detect this class of enzymes. 

Consensus pattern: E-A-[SC]-G-x-[GS]-x-M-K-x(2)-[LM]-N [K is the pyridoxal-P
attachment site]

[ 1] Fukui T., Shimomura S., Nakano K. Mol. Cell. Biochem. 42:129-144(1982). 

458. Protein kinases signatures and profile 

Eukaryotic protein kinases [1 to 5] are enzymes that belong to a very extensive family of
proteins which share a conserved catalytic core common to both serine/threonine and tyrosine
protein kinases. There are a number of conserved regions in the catalytic domain of protein
kinases. Two of these regions were selected to build signature patterns. The first region,
which is located in the N-terminal extremity of the catalytic domain, is a glycine-rich stretch
of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP
binding. The second region, which is located in the central part of the catalytic domain,
contains a conserved aspartic acid residue which is important for the catalytic activity of the
enzyme [6]. Two signature patterns were derived for that region: one specific for
serine/threonine kinases and the other for tyrosine kinases. A profile was also developed
which is based on the alignment in [1] and covers the entire catalytic domain.

Consensus pattern: [LIV]-G-{P}-G-{P}-[FYWMGSTNH]-[SGA]-{PW}-[LIVCAT]-{PD}-x- 
[GSTACLIVMFY]-x(5,18)-[LIVMFYWCSTAR]-[AIVP]-[LIVMFAGCKR]-K [K binds 
ATP]. The majority of known protein kinases belong to the class detected by this pattern, but 
it fails to find a number of them, especially viral kinases which are quite divergent in this 
region and are completely missed by this pattern. 

Consensus pattern: [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3) [D is 
an active site residue]. Most serine/threonine specific protein kinases belong to the class
detected by the pattern with 10 exceptions (half of them viral kinases) and also Epstein-Barr 
virus BGLF4 and Drosophila ninaC which have respectively Ser and Arg instead of the 
conserved Lys and which are therefore detected by the tyrosine kinase specific pattern 
described below. 

Consensus pattern: [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-[RSTAC]-x(2)-N-[LIVMFYC](3)
[D is an active site residue]. All tyrosine specific protein kinases, with the exception of
human ERBB3 and mouse blk, belong to the class detected by this pattern. This pattern will
also detect most bacterial aminoglycoside phosphotransferases [8,9] and herpesvirus
ganciclovir kinases [10], which are proteins structurally and evolutionarily related to protein
kinases. The profile also detects receptor guanylate cyclases and 2-5A-dependent
ribonucleases. Sequence similarities between these two families and the eukaryotic protein
kinase family have been noticed before. It also detects Arabidopsis thaliana kinase-like
protein TMKL1, which seems to have lost its catalytic activity. If a protein analyzed includes
the two protein kinase signatures, the probability of it being a protein kinase is close to 100%.
Eukaryotic-type protein kinases have also been found in prokaryotes such as Myxococcus
xanthus [11] and Yersinia pseudotuberculosis.
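
The two-signature rule stated above (an ATP-binding region hit plus an active-site hit) can be checked mechanically. The sketch below hand-translates the three kinase patterns from the text into regular expressions and reports which ones occur in a sequence. The names SIGNATURES, kinase_signatures and has_both_kinase_signatures, and the toy sequence, are assumptions made for illustration; the sketch does not implement the profile, only the patterns.

```python
import re

# The three kinase patterns from the text, written as regular expressions.
SIGNATURES = {
    "atp_binding": re.compile(
        r"[LIV]G[^P]G[^P][FYWMGSTNH][SGA][^PW][LIVCAT][^PD]."
        r"[GSTACLIVMFY].{5,18}[LIVMFYWCSTAR][AIVP][LIVMFAGCKR]K"
    ),
    "ser_thr_active_site": re.compile(r"[LIVMFYC].[HY].D[LIVMFY]K.{2}N[LIVMFYCT]{3}"),
    "tyr_active_site": re.compile(r"[LIVMFYC].[HY].D[LIVMFY][RSTAC].{2}N[LIVMFYC]{3}"),
}

def kinase_signatures(seq):
    """Report, for each signature, whether it occurs anywhere in seq."""
    return {name: bool(rx.search(seq)) for name, rx in SIGNATURES.items()}

def has_both_kinase_signatures(seq):
    """True when the ATP-binding signature and one active-site signature are present."""
    hits = kinase_signatures(seq)
    return hits["atp_binding"] and (
        hits["ser_thr_active_site"] or hits["tyr_active_site"]
    )

# Toy sequence assembled for illustration only; it is not a database entry.
TOY = "MALGTGSFGRVMLVKHKETGNHYAMKILDAAALIYRDLKPENLLIDGG"
print(kinase_signatures(TOY), has_both_kinase_signatures(TOY))
```

As the text notes, some divergent kinases are missed by the ATP-binding pattern, so a negative result from such a check does not rule out kinase identity.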

[ 1] Hanks S.K., Hunter T. FASEB J. 9:576-596(1995).

[ 2] Hunter T. Meth. Enzymol. 200:3-37(1991). 

[ 3] Hanks S.K., Quinn A.M. Meth. Enzymol. 200:38-62(1991). 

[ 4] Hanks S.K. Curr. Opin. Struct. Biol. 1:369-383(1991). 

[ 5] Hanks S.K., Quinn A.M., Hunter T. Science 241:42-52(1988). 

[ 6] Knighton D.R., Zheng J., Ten Eyck L.F., Ashford V.A., Xuong N.-H., Taylor S.S.,
Sowadski J.M. Science 253:407-414(1991).

[ 7] Bairoch A., Claverie J.-M. Nature 331:22(1988). 

[ 8] Benner S. Nature 329:21-21(1987). 

[ 9] Kirby R. J. Mol. Evol. 30:489-492(1992). 

[10] Littler E., Stuart A.D., Chee M.S. Nature 358:160-162(1992). 

[11] Munoz-Dorado J., Inouye S., Inouye M. Cell 67:995-1006(1991).

Receptor tyrosine kinase class II signature 

A number of growth factors stimulate mitogenesis by interacting with a family of cell surface
receptors which possess an intrinsic, ligand-sensitive, protein tyrosine kinase activity [1].
These receptor tyrosine kinases (RTK) all share the same topology: an extracellular
ligand-binding domain, a single transmembrane region and a cytoplasmic kinase domain. However,
they can be classified into at least five groups. The prototype for class II RTK's is the insulin
receptor, a heterotetramer of two alpha and two beta chains linked by disulfide bonds. The
alpha and beta chains are cleavage products of a precursor molecule. The alpha chain
contains the ligand binding site; the beta chain traverses the membrane and contains the
tyrosine protein kinase domain. The receptors currently known to belong to class II are: -
Insulin receptor from vertebrates. - Insulin growth factor I receptor from mammals. - Insulin
receptor-related receptor (IRR), which is most probably a receptor for a peptide belonging to
the insulin family. - Insect insulin-like receptors. - Molluscan insulin-related peptide(s)
receptor (MIP-R). - Insulin-like peptide receptor from Branchiostoma lanceolatum. - The
Drosophila developmental protein sevenless, a putative receptor for positional information
required for the formation of the R7 photoreceptor cells. - The trk family of receptors
(NTRK1, NTRK2 and NTRK3), which are high affinity receptors for nerve growth factor and
related neurotrophic factors (BDNF and NT-3). And the following uncharacterized receptors:
- ROS. - LTK (TYK1). - EDDR1 (cak, TRKE, RTK6). - NTRK3 (Tyro10, TKT). - A sponge
putative receptor tyrosine kinase. While only the insulin and the insulin growth factor I
receptors are known to exist in the tetrameric conformation specific to class II RTK's, all the
above proteins share extensive homologies in their kinase domain, especially around the
putative site of autophosphorylation. Hence, a signature pattern was developed for this class
of RTK's, which includes the tyrosine residue that is itself probably autophosphorylated.

Consensus pattern: [DN]-[LIV]-Y-x(3)-Y-Y-R [The second Y is the autophosphorylation 
site] 

[ 1] Yarden Y., Ullrich A. Annu. Rev. Biochem. 57:443-478(1988).

Receptor tyrosine kinase class III signature

A number of growth factors stimulate mitogenesis by interacting with a family of cell surface
receptors which possess an intrinsic, ligand-sensitive, protein tyrosine kinase activity [1].
These receptor tyrosine kinases (RTK) all share the same topology: an extracellular
ligand-binding domain, a single transmembrane region and a cytoplasmic kinase domain. However,
they can be classified into at least five groups. The class III RTK's are characterized by the
presence of five to seven immunoglobulin-like domains [2] in their extracellular section.
Their kinase domain differs from that of other RTK's by the insertion of a stretch of 70 to 100
hydrophilic residues in the middle of this domain. The receptors currently known to belong to
class III are: - Platelet-derived growth factor receptor (PDGF-R). PDGF-R exists as a homo-
or heterodimer of two related chains: alpha and beta [3]. - Macrophage colony stimulating
factor receptor (CSF-1-R) (also known as the fms oncogene). - Stem cell factor (mast cell
growth factor) receptor (also known as the kit oncogene). - Vascular endothelial growth
factor (VEGF) receptors Flt-1 and Flk-1/KDR [4]. - Fl cytokine receptor Flk-2/Flt-3 [5]. -
The putative receptor Flt-4 [6]. A signature pattern was developed for this class of RTK's
which is based on a conserved region in the kinase domain.

Consensus pattern: G-x-H-x-N-[LIVM]-V-N-L-L-G-A-C-T

[ 1] Yarden Y., Ullrich A. Annu. Rev. Biochem. 57:443-478(1988).
[ 2] Hunkapiller T., Hood L. Adv. Immunol. 44:1-63(1989).
[ 3] Lee K.-H., Bowen-Pope D.F., Reed R.R. Mol. Cell. Biol. 10:2237-2246(1990).
[ 4] Terman B.I., Dougher-Vermazen M., Carrion M.E., Dimitrov D., Armellino D.C.,
Gospodarowicz D., Boehlen P. Biochem. Biophys. Res. Commun. 187:1579-1586(1992).
[ 5] Lyman S.D., James L., Vanden Bos T., de Vries P., Brasel K., Gliniak B., Hollingsworth
L.T., Picha K.S., McKenna H.J., Splett R.R. Cell 75:1157-1167(1993).
[ 6] Galland F., Karamysheva A., Pebusque M.J., Borg J.P., Rottapel R., Dubreuil P., Rosnet
O., Birnbaum D. Oncogene 8:1233-1240(1993).

Receptor tyrosine kinase class V signatures 

A number of growth factors stimulate mitogenesis by interacting with a family of cell surface
receptors which possess an intrinsic, ligand-sensitive, protein tyrosine kinase activity [1].
These receptor tyrosine kinases (RTK) all share the same topology: an extracellular
ligand-binding domain, a single transmembrane region and a cytoplasmic kinase domain. However,
they can be classified into at least five groups on the basis of sequence similarities. The
extracellular domain of class V RTK's consists of a region of about 300 amino acids, among
which are 16 conserved cysteines probably involved in disulfide bonds; this region is followed
by two copies of a fibronectin type III domain. The ligands for these receptors are proteins of
about 200 to 300 residues collectively known as Ephrins. The receptors currently known to
belong to class V are [2,3,E1]: - EPHA1 (Eph-1; Esk). - EPHA2 (Eck; Mpk-5; Sek-2). -
EPHA3 (Etk-1; Hek; Mek4; Tyro4; Rek4; Cek4). - EPHA4 (Sek; Hek8; Mpk-3; Cek8). -
EPHA5 (Ehk-1; Hek7; Bsk; Cek7). - EPHA6 (Ehk-2). - EPHA7 (Ehk-3; Hek11; Mdk-1;
Ebk). - EPHA8 (Eek). - EPHB1 (Eph-2; Elk; Net). - EPHB2 (Eph-3; Hek5; Drt; Erk; Nuk;
Sek-3; Cek5; Qek5). - EPHB3 (Hek-2; Mdk-5). - EPHB4 (Htk; Mdk-2; Myk-1). - EPHB5
(Cek9). The EPHA subtype receptors bind to GPI-anchored ephrins while the EPHB subtype
receptors bind to type-I membrane ephrins. Two signature patterns were developed for this
class of RTK's, each of which includes some of the conserved cysteine residues.

Consensus pattern: F-x-[DN]-x-[GAW]-[GA]-C-[LIVM]-[SA]-[LIVM](2)-[SA]-[LV]- 
[KRHQ]-[LIVA]-x(3)-[KR]-C-[PSAW] [The two Cs are probably involved in disulfide 
bonds] 

Consensus pattern: C-x(2)-[DE]-G-[DEQ]-W-x(2,3)-[PAQ]-[LIVMT]-[GT]-x-C-x-C-x(2)-
G-[HFY]-[EQ] [The three C's are probably involved in disulfide bonds] 
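Consensus patterns written in this notation can be scanned against a sequence by translating them into regular expressions. The following is a minimal sketch, assuming the usual reading of the notation (x for any residue, square brackets for allowed residues, braces for excluded residues, parentheses for repeat counts). The helper name `prosite_to_regex` and the toy sequence are hypothetical and are not part of the specification.

```python
import re

# One pattern element: optional '<' anchor, a residue spec,
# an optional repeat count, and an optional '>' anchor.
_ELEMENT = re.compile(
    r"^(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)$"
)

def prosite_to_regex(pattern):
    """Translate a consensus pattern into a Python regular expression."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        m = _ELEMENT.match(element)
        if m is None:
            raise ValueError(f"unrecognized pattern element: {element!r}")
        start, core, lo, hi, end = m.groups()
        if core == "x":
            core = "."                       # x: any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # {..}: any residue except those listed
        if lo is not None:                   # (n) or (n,m): repeat count
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(("^" if start else "") + core + ("$" if end else ""))
    return "".join(parts)

CLASS_V_PATTERN_2 = (
    "C-x(2)-[DE]-G-[DEQ]-W-x(2,3)-[PAQ]-[LIVMT]-[GT]-x-C-x-C-x(2)-G-[HFY]-[EQ]"
)
regex = prosite_to_regex(CLASS_V_PATTERN_2)

# Hypothetical toy sequence built to contain one match; not a real receptor.
toy_sequence = "MKT" + "CAADGDWAAPLGACACAAGHE" + "LLV"
hit = re.search(regex, toy_sequence)
print(regex)
print(hit.start(), hit.group())
```

The variable-length element x(2,3) becomes `.{2,3}`, so the regular expression engine tries both spacings when looking for a match.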

[ 1] Yarden Y., Ullrich A. Annu. Rev. Biochem. 57:443-478(1988). 
[ 2] Sajjadi F.G., Pasquale E.B., Subramani S. New Biol. 3:769-778(1991). 

[ 3] Wicks I.P., Wilkinson D., Salvaris E., Boyd A.W. Proc. Natl. Acad. Sci. U.S.A. 89:1611-1615(1992). 

459. Protein kinase C terminal domain 

460. Plant thionins signature 

Thionins are small, basic, plant proteins generally toxic to animal cells [1]. They seem to exert 
their toxic effect at the level of the cell membrane but their exact function is not known. They 
consist of a polypeptide chain of forty five to fifty amino acids with three to four internal 
disulfide bonds. They are found in seeds but also in the cell wall of leaves [2]. Thionins are 
processed from larger precursor proteins [3]. Crambin [4], a hydrophobic plant seed protein, 
also belongs to this family. The pattern to detect this family of proteins includes three of the 
six cysteine residues involved in disulfide bonds. 

xxCCxxxxxxxxxxxCxxxxxxxxxCxxxCxxCxxxxxCxxxxxxxx 
  ************** 

'C': conserved cysteine involved in a disulfide bond. '*': position of the pattern. 

Consensus pattern: C-C-x(5)-R-x(2)-[FY]-x(2)-C [The three C's are involved in disulfide 
bonds] The proteins from the gamma-thionin family are not related to the above proteins and 
are described in a separate section. 

[ 1] Vernon L.P., Evett G.E., Zeikus R.D., Gray W.R. Arch. Biochem. Biophys. 238:18-29(1985). 

[ 2] Bohlmann H., Clausen S., Behnke S., Giese H., Hiller C., Reimann-Phillip U., Schrader 
G., Barkholt V., Apel K. EMBO J. 7:1559-1565(1988). 

[ 3] Bohlmann H., Apel K. Mol. Gen. Genet. 207:446-454(1987). 

[ 4] Teeter M.M., Mazer J.A., L'Italien J.J. Biochemistry 20:5437-5443(1981). 

461. Polyprenyl synthetases signatures 

A variety of isoprenoid compounds are synthesized by various organisms. For example in 
eukaryotes the isoprenoid biosynthetic pathway is responsible for the synthesis of a variety of 
end products including cholesterol, dolichol, ubiquinone or coenzyme Q. In bacteria this 
pathway leads to the synthesis of isopentenyl tRNA, isoprenoid quinones, and sugar carrier 
lipids. Among the enzymes that participate in that pathway are a number of polyprenyl 
synthetase enzymes which catalyze a 1'4-condensation between 5 carbon isoprene units. 
Currently the sequence of some of these enzymes is known: - Eukaryotic farnesyl 
pyrophosphate synthetase (FPP synthetase) (EC 2.5.1.1 / EC 2.5.1.10) which catalyzes the 
sequential condensation of isopentenyl pyrophosphate (IPP) with dimethylallyl 
pyrophosphate (DMAPP), and then with the resultant geranyl pyrophosphate to form farnesyl 
pyrophosphate. FPP synthetase is a cytoplasmic dimeric enzyme. - Prokaryotic farnesyl 
pyrophosphate synthetase (gene ispA). - Prokaryotic octaprenyl diphosphate synthase (gene 
ispB). - Prokaryotic heptaprenyl diphosphate synthase (EC 2.5.1.30). - Eukaryotic 
geranylgeranyl pyrophosphate synthetase (GGPP synthetase) (EC 2.5.1.1 / EC 2.5.1.10 / EC 
2.5.1.29) which catalyzes the sequential addition of the three molecules of IPP onto DMAPP 
to form geranylgeranyl pyrophosphate. In plants GGPP synthase is a chloroplast enzyme 
involved in the biosynthesis of terpenoids; in fungi, such as Neurospora crassa (gene al-3), 
this enzyme is involved in the biosynthesis of carotenoids. - Prokaryotic GGPP synthetase, 
which is involved in the biosynthesis of carotenoids (gene crtE). Such an enzyme is also 
encoded in the cyanelle genome of Cyanophora paradoxa. - Eukaryotic hexaprenyl 
pyrophosphate synthetase, which is involved in the biosynthesis of coenzyme Q and which 
catalyzes the formation of all-trans-polyprenyl pyrophosphates generally ranging in length 
between 6 and 10 isoprene units depending on the species. HP synthetase is a mitochondrial 
membrane-associated enzyme. It has been shown [1 to 5] that all the above enzymes share 
some regions of sequence similarity. Two of these regions are rich in aspartic-acid residues 
and could be involved in the catalytic mechanism and/or the binding of the substrates. 
Signature patterns were developed for both regions. Possible additional members of this 
family of proteins are: - Bacillus subtilis spore germination protein C3 (gene gerC3). Both 
proteins are most probably also enzymes involved in isoprenoid metabolism [6]. 



Consensus pattern: [LIVM](2)-x-D-D-x(2,4)-D-x(4)-R-R-[GH]

Consensus pattern: [LIVMFY]-G-x(2)-[FYL]-Q-[LIVM]-x-D-D-[LIVMFY]-x-[DNG] 

[ 1] Ashby M.N., Edwards P.A. J. Biol. Chem. 265:13157-13164(1990). 

[ 2] Fujisaki S., Hara H., Nishimura Y., Horiuchi K., Nishino T. J. Biochem. 108:995-1000(1990). 

[ 3] Carattoli A., Romano N., Ballario P., Morelli G., Macino G. J. Biol. Chem. 266:5854-
5859(1991). 

[ 4] Kuntz M., Roemer S., Suire C., Hugueney P., Weil J.H., Schantz R., Camara B. Plant J. 
2:25-34(1992). 

[ 5] Math S.K., Hearst J.E., Poulter C.D. Proc. Natl. Acad. Sci. U.S.A. 89:6761-6764(1992). 
[ 6] Bairoch A. Unpublished observations (1993). 

462. Potato inhibitor I family signature 

The potato inhibitor I family is one of the numerous families of serine proteinase inhibitors. 
Members of this protein family are found in plants; in the seeds of barley or beans [1,2,3], 
and in potato or tomato leaves where they accumulate in response to mechanical damage 
[4,5]. An inhibitor belonging to this family is also found in leech [6]. It is interesting to note 
that, currently, this is the only proteinase inhibitor family to be found both in plant and animal 
kingdoms. Structurally these inhibitors are small (60 to 90 residues) and in contrast with 
other families of protease inhibitors, they lack disulfide bonds. They have a single inhibitory 
site. The consensus pattern includes three out of the four residues conserved in all members 
of this family and is located in the N-terminal half. 

Consensus pattern: [FYW]-P-[EQH]-[LIV](2)-G-x(2)-[STAGV]-x(2)-A. Barley 
subtilisin-chymotrypsin inhibitor-2b has Glu instead of Gly. There is a trypsin inhibitor from the 
cucurbitaceae Momordica charantia [7], which is said to belong to the potato inhibitor I 
family but which shows only a very weak similarity with the other members of this family. 

[ 1] Svendsen I., Hejgaard J., Chavan J.K. Carlsberg Res. Commun. 49:493-502(1984). 

[ 2] Svendsen I., Boisen S., Hejgaard J. Carlsberg Res. Commun. 47:45-53(1982). 

[ 3] Nozawa H., Yamagata H., Aizono Y., Yoshikawa M., Iwasaki T. J. Biochem. 106:1003-1008(1989). 

[ 4] Cleveland T.E., Thornburg R.W., Ryan C.A. Plant Mol. Biol. 8:199-207(1987). 

[ 5] Lee J.S., Brown W.E., Graham J.S., Pearce G., Fox E.A., Dreher T.W., Ahern K.G., 
Pearson G.D., Ryan C.A. Proc. Natl. Acad. Sci. U.S.A. 83:7277-7281(1986). 

[ 6] Seemuller U., Eulitz M., Fritz H., Strobl A. Hoppe-Seyler's Z. Physiol. Chem. 361:1841-1846(1980). 

[ 7] Zeng F.-Y., Qian R.-Q., Wang Y. FEBS Lett. 234:35-38(1988). 

463. (pp binding) Phosphopantetheine attachment site 

Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier 
proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the 
attachment of activated fatty acid and amino-acid groups [1]. Phosphopantetheine is attached 
to a serine residue in these proteins [2]. ACP proteins or domains have been found in various 
enzyme systems which are listed below (references are only provided for recently determined 
sequences). - Fatty acid synthetase (FAS), which catalyzes the formation of long-chain fatty 
acids from acetyl-CoA, malonyl-CoA and NADPH. Bacterial and plant chloroplast FAS are 
composed of eight separate subunits which correspond to the different enzymatic activities; 
ACP is one of these polypeptides. Fungal FAS consists of two multifunctional proteins, 
FAS1 and FAS2; the ACP domain is located in the N-terminal section of FAS2. Vertebrate 
FAS consists of a single multifunctional enzyme; the ACP domain is located between the 
beta-ketoacyl reductase domain and the C-terminal thioesterase domain [3]. - Polyketide 
antibiotics synthase enzyme systems. Polyketides are secondary metabolites produced from 
simple fatty acids by microorganisms and plants. ACP is one of the polypeptidic components 
involved in the biosynthesis of Streptomyces polyketide antibiotics actinorhodin, curamycin, 
granaticin, monensin, oxytetracycline and tetracenomycin C. - Bacillus subtilis putative 
polyketide synthases pksK, pksL and pksM which respectively contain three, five and one 
ACP domains. - The multifunctional 6-methylsalicylic acid synthase (MSAS) from 
Penicillium patulum. This is a multifunctional enzyme involved in the biosynthesis of a 
polyketide antibiotic and which contains an ACP domain in the C-terminal extremity. - 
Multifunctional mycocerosic acid synthase (gene mas) from Mycobacterium bovis. - 
Gramicidin S synthetase I (gene grsA) from Bacillus brevis. This enzyme catalyzes the first 
step in the biosynthesis of the cyclic antibiotic gramicidin S. - Tyrocidine synthetase I (gene 
tycA) from Bacillus brevis. The reaction carried out by tycA is identical to that catalyzed by 
grsA. - Gramicidin S synthetase II (gene grsB) from Bacillus brevis. This enzyme is a 
multifunctional protein that activates and polymerizes proline, valine, ornithine and leucine. 
GrsB contains four ACP domains. - Erythronolide synthase proteins 1, 2 and 3 from 
Saccharopolyspora erythraea which are involved in the biosynthesis of the polyketide 
antibiotic erythromycin. Each of these proteins contains two ACP domains. - Conidial green 
pigment synthase from Aspergillus nidulans. - ACV synthetase from various fungi. This 
enzyme catalyzes the first step in the biosynthesis of penicillin and cephalosporin. It contains 
three ACP domains. - Enterobactin synthetase component F (gene entF) from Escherichia 
coli. This enzyme is involved in the ATP-dependent activation of serine during enterobactin 
(enterochelin) biosynthesis. - Cyclic peptide antibiotic surfactin synthase subunits 1, 2 and 3 
from Bacillus subtilis. Subunits 1 and 2 contain three related domains while subunit 3 only 
contains a single domain. - HC-toxin synthetase (gene HTS1) from Cochliobolus carbonum. 
This enzyme synthesizes HC-toxin, a cyclic tetrapeptide. HTS1 contains four ACP domains. - 
Fungal mitochondrial ACP [9], which is part of the respiratory chain NADH dehydrogenase 
(complex I). - Rhizobium nodulation protein nodF, which probably acts as an ACP in the 
synthesis of the nodulation Nod factor fatty acyl chain. The sequence around the 
phosphopantetheine attachment site is conserved in all these proteins and can be used as a 
signature pattern. A profile was also developed that spans the complete ACP-like domain. 

Consensus pattern: [DEQGSTALMKRH]-[LIVMFYSTAC]-[GNQ]-[LIVMFYAG]-
[DNEKHS]-S-[LIVMST]-{PCFY}-[STAGCPQLIVMF]-[LIVMATN]-
[DENQGTAKRHLM]-[LIVMWSTA]-[LIVGSTACR]-x(2)-[LIVMFA] [S is the pantetheine 
attachment site] 
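The pattern above contains an excluded-residue element, {PCFY}, and marks the serine at its sixth position as the attachment site. The following is a minimal sketch of how a match can be turned into a candidate attachment position, assuming a hand translation of the pattern into a regular expression. The function name `attachment_sites` and both toy fragments are hypothetical illustrations, not sequences from this specification.

```python
import re

# Hand translation of the consensus pattern above;
# {PCFY} becomes the negated class [^PCFY].
PP_BINDING = re.compile(
    "[DEQGSTALMKRH][LIVMFYSTAC][GNQ][LIVMFYAG][DNEKHS]S[LIVMST][^PCFY]"
    "[STAGCPQLIVMF][LIVMATN][DENQGTAKRHLM][LIVMWSTA][LIVGSTACR].{2}[LIVMFA]"
)
SERINE_OFFSET = 5  # the attachment serine is the sixth position of the pattern

def attachment_sites(sequence):
    """Return 0-based indices of candidate phosphopantetheine attachment serines."""
    return [m.start() + SERINE_OFFSET for m in PP_BINDING.finditer(sequence)]

toy = "VEDLGADSLDTVELVMALEE"      # hypothetical fragment containing one match
mutant = "VEDLGADSLPTVELVMALEE"   # same fragment with Pro at the {PCFY} position

print(attachment_sites(toy))      # index of the serine inside the match
print(attachment_sites(mutant))   # empty: the excluded residue blocks the match
```

Because the pattern is deliberately permissive at most positions, a hit of this kind is only a candidate site and would normally be checked against the profile for the complete ACP-like domain.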

[ 1] Concise Encyclopedia Biochemistry, Second Edition, Walter de Gruyter, Berlin New-York (1988). 

[ 2] Pugh E.L., Wakil S.J. J. Biol. Chem. 240:4727-4733(1965). 

[ 3] Witkowski A., Rangan V.S., Randhawa Z.I., Amy C.M., Smith S. Eur. J. Biochem. 
198:571-579(1991). 

[ 6] Scotti C., Piatti M., Cuzzoni A., Perani P., Tognoni A., Grandi G., Galizzi A., Albertini 
A.M. Gene 130:65-71(1993). 

[ 9] Sackmann U., Zensen R., Rohlen D., Jahnke U., Weiss H. Eur. J. Biochem. 200:463-
469(1991). 

464. (Prenyltrans) Terpene synthases signature 

The following enzymes catalyze mechanistically related reactions which involve the highly 
complex cyclic rearrangement of squalene or its 2,3 oxide: - Lanosterol synthase (EC 
5.4.99.7) (oxidosqualene-lanosterol cyclase), which catalyzes the cyclization of (S)-2,3-
epoxysqualene to lanosterol, the initial precursor of cholesterol, steroid hormones and 
vitamin D in vertebrates and of ergosterol in fungi (gene ERG7). - Cycloartenol synthase (EC 
5.4.99.8) (2,3-epoxysqualene-cycloartenol cyclase), a plant enzyme that catalyzes the 
cyclization of (S)-2,3-epoxysqualene to cycloartenol. - Hopene synthase (EC 5.4.99.-) 
(squalene-hopene cyclase), a bacterial enzyme that catalyzes the cyclization of squalene into 
hopene, a key step in hopanoid (triterpenoid) metabolism. These enzymes are evolutionarily 
related [1] proteins of about 70 to 85 Kd. As a signature pattern, a highly conserved region 
was selected which is rich in aromatic residues and which is located in the C-terminal section. 

Consensus pattern: [DE]-G-S-W-x-G-x-W-[GA]-[LIVM]-x-[FY]-x-Y-[GA] 

[ 1] Corey E.J., Matsuda S.P.T., Bartel B. Proc. Natl. Acad. Sci. U.S.A. 90:11628- 
11632(1993). 



465. Prion protein signatures 

Prion protein (PrP) [1,2,3] is a small glycoprotein found in high quantity in the brains of 
humans or animals infected with a number of degenerative neurological diseases such as 
Kuru, Creutzfeldt-Jakob disease (CJD), scrapie or bovine spongiform encephalopathy (BSE). 
PrP is encoded in the host genome and expressed both in normal and infected cells. It has a 
tendency to aggregate yielding polymers called rods. Structurally, PrP is a protein consisting 
of a signal peptide, followed by an N-terminal domain that contains tandem repeats of a short 
motif (PHGGGWGQ in mammals, PHNPGY in chicken), itself followed by a highly 
conserved domain. Finally comes a C-terminal hydrophobic domain post-translationally removed 
when PrP is attached to the extracellular side of the cell membrane by a GPI-anchor. The 
structure of PrP is shown in the following schematic representation: 

[Schematic: sig | Tandem repeats | conserved domain (C, C, S) | GPI] 

'C': conserved cysteine involved in a disulfide bond. '*': position of the patterns. 

As signature pattern for PrP, a perfectly conserved 
alanine- and glycine-rich region of 16 residues was selected as well as a region centered on 
the second cysteine involved in the disulfide bond. 

Consensus pattern: A-G-A-A-A-A-G-A-V-V-G-G-L-G-G-Y

Consensus pattern: E-x-[ED]-x-K-[LIVM](2)-x-[KR]-[LIVM](2)-x-[QE]-M-C-x(2)-Q-Y [C 
is involved in a disulfide bond] 

[ 1] Stahl N., Prusiner S.B. FASEB J. 5:2799-2807(1991). 

[ 2] Brunori M., Chiara Silvestrini M., Pocchiari M. Trends Biochem. Sci. 13:309-313(1988). 
[ 3] Prusiner S.B. Annu. Rev. Microbiol. 43:345-374(1989). 

466. Cyclophilin-type peptidyl-prolyl cis-trans isomerase signature and profile (pro 
isomerase) 

Cyclophilin [1] is the major high-affinity binding protein in vertebrates for the 
immunosuppressive drug cyclosporin A (CSA). It exhibits a peptidyl-prolyl cis-trans 
isomerase activity (EC 5.2.1.8) (PPIase or rotamase). PPIase is an enzyme that accelerates 
protein folding by catalyzing the cis-trans isomerization of proline imidic peptide bonds in 
oligopeptides [2]. It is probable that CSA mediates some of its effects via an inhibitory action 
on PPIase. Cyclophilin is a cytosolic protein which belongs to a family [3,4,5] that also 
includes the following isozymes: - Cyclophilin B (or S-cyclophilin), a PPIase which is 
retained in an endoplasmic reticulum compartment. - Cyclophilin C, a cytoplasmic PPIase. - 
Mitochondrial matrix cyclophilin (cyp3). - A PPIase which seems specific for the folding of 
rhodopsin and is an integral membrane protein anchored by a C-terminal transmembrane 
region. This protein was first characterized in Drosophila (gene ninaA). - Bacterial 
periplasmic PPIase (gene ppiA). - Bacterial cytosolic PPIase (gene ppiB). - Natural-killer cell 
cyclophilin-related protein. This large protein (about 160 Kd) is a component of a putative 
tumor-recognition complex involved in the function of NK cells. It contains a 
cyclophilin-type PPIase domain. - Mammalian nucleoporin Nup358 [6], a nuclear pore complex protein 
of 358 Kd that contains a C-terminal cyclophilin-type PPIase domain. - Yeast hypothetical 
protein YJR032w. - Fission yeast hypothetical protein SpAC21E11.05c. - Caenorhabditis 
elegans hypothetical protein T27D1.1. The sequences of the different forms of 
cyclophilin-type PPIases are well conserved. As a signature pattern, a conserved region was selected in 
the central part of these enzymes. 

Consensus pattern: [FY]-x(2)-[STCNLV]-x-F-H-[RH]-[LIVMN]-[LIVM]-x(2)-F-[LIVM]-x-
Q-[AG]-G. FKBP's, a family of proteins that bind the immunosuppressive drug FK506, are 
also PPIases, but their sequence is not at all related to that of cyclophilin. 

[ 1] Stamnes M.A., Rutherford S.L., Zuker C.S. Trends Cell Biol. 2:272-276(1992). 

[ 2] Fischer G., Schmid F.X. Biochemistry 29:2205-2212(1990). 

[ 3] Trandinh C.C., Pao G.M., Saier M.H. Jr. FASEB J. 6:3410-3420(1992). 

[ 4] Galat A. Eur. J. Biochem. 216:689-707(1993). 

[ 5] Hacker J., Fischer G. Mol. Microbiol. 10:445-456(1993). 

[ 6] Wu J., Matunis M.J., Kraemer D., Blobel G., Coutavas E. J. Biol. Chem. 270:14209- 
14213(1995). 

467. Profilin signature 

Profilin [1,2] is a small eukaryotic protein that binds to monomeric actin (G-actin) in a 1:1 
ratio thus preventing the polymerization of actin into filaments (F-actin). It can also, in 
certain circumstances, promote actin polymerization. Profilin also binds to 
polyphosphoinositides such as PIP2. Overall sequence similarity among profilin from 
organisms which belong to different phyla (ranging from fungi to mammals) is low, but the 
N-terminal region is relatively well conserved. That region is thought to be involved in the 
binding to actin. The signature pattern for profilin is based on conserved residues at the 
N-terminal extremity. A protein structurally similar to profilin is present in the genome of 
variola and vaccinia viruses (gene A42R). 

Consensus pattern: <x(0,1)-[STA]-x(0,1)-W-[DENQH]-x-[YI]-x-[DEQ]
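The leading '<' restricts this pattern to the N-terminal extremity of the sequence, and the two x(0,1) elements allow for an optional residue. The following is a minimal sketch of that behaviour, assuming a hand translation into a regular expression in which '<' maps to '^'. The function name `has_profilin_signature` and the toy fragments are hypothetical.

```python
import re

# '<' anchors the pattern to the N-terminus, so it is written as '^';
# x(0,1) becomes an optional residue, '.{0,1}'.
PROFILIN = re.compile(r"^.{0,1}[STA].{0,1}W[DENQH].[YI].[DEQ]")

def has_profilin_signature(sequence):
    """True if the sequence starts with the profilin N-terminal signature."""
    return PROFILIN.match(sequence) is not None

n_terminal = "MAGWNAYIDNLM"        # hypothetical N-terminal fragment
internal = "GGGG" + n_terminal     # same motif shifted away from the N-terminus

print(has_profilin_signature(n_terminal))
print(has_profilin_signature(internal))
```

The second call reports no match because the motif is no longer at the start of the sequence, which is the effect the '<' anchor is meant to have.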

[ 1] Haarer B.K., Brown S.S. Cell Motil. Cytoskeleton 17:71-74(1990). 
[ 2] Sohn R.H., Goldschmidt-Clermont P. BioEssays 16:465-472(1994). 

468. Protamine P1 signature 

Protamines are small, highly basic proteins, that substitute for histones in sperm chromatin 
during the haploid phase of spermatogenesis. They pack sperm DNA into a highly 
condensed, stable and inactive complex. There are two different types of mammalian 
protamine, called P1 and P2. P1 has been found in all species studied, while P2 is sometimes 
absent. There seems to be a single type of avian protamine whose sequence is closely related 
to that of mammalian P1 [1]. As a signature for this family of proteins, a conserved region 
was selected at the N-terminal extremity of the sequence. 

Consensus pattern: [AV]-R-[NFY]-R-x(2,3)-[ST]-x-S-x-S

[ 1] Oliva R., Goren R., Dixon G.H. J. Biol. Chem. 264:17627-17630(1989). 

469. Sperm histone P2 (protamine P2) 

This protein also known as protamine P2 can substitute for histones in the chromatin of 
sperm. The alignment contains both the sequence of the mature P2 protein and its propeptide. 

470. Proteasome A-type subunits signature 

The proteasome (or macropain) (EC 3.4.99.46) [1 to 5,E1] is a eukaryotic and 
archaebacterial multicatalytic proteinase complex that seems to be involved in an 
ATP/ubiquitin-dependent nonlysosomal proteolytic pathway. In eukaryotes the proteasome is 
composed of about 28 distinct subunits which form a highly ordered ring-shaped structure 
(20S ring) of about 700 Kd. Most proteasome subunits can be classified, on the basis of 
sequence similarities, into two groups, A and B. Subunits that belong to the A-type group are 
proteins of from 210 to 290 amino acids that share a number of conserved sequence regions. 
Subunits that are known to belong to this family are listed below. - Vertebrate subunits C2 
(nu), C3, C8, C9, iota and zeta. - Drosophila PROS-25, PROS-28.1, PROS-29 and PROS-35. 
- Yeast C1 (PRS1), C5 (PRS3), C7-alpha (Y8) (PRS2), Y7, Y13, PRE5, PRE6 and PUP2. - 
Arabidopsis thaliana subunits alpha and PSM30. - Thermoplasma acidophilum alpha-subunit. 
In this archaebacterium the proteasome is composed of only two different subunits. As a 
signature pattern for proteasome A-type subunits the best conserved region was selected, 
which is located in the N-terminal part of these proteins. 

Consensus pattern: [FY]-x(4)-[STNV]-x-[FYW]-S-P-x-G-[RKH]-x(2)-Q-[LIVM]-[DE]-Y-
[SAD]-x(2)-[SAG]. These proteins belong to family T1 in the classification of peptidases 
[6,E2]. 

[ 1] Rivett A.J. Biochem. J. 291:1-10(1993). 

[ 2] Rivett A.J. Arch. Biochem. Biophys. 268:1-8(1989). 

[ 3] Goldberg A.L., Rock K.L. Nature 357:375-379(1992). 

[ 4] Wilk S. Enzyme Protein 47:187-188(1993). 

[ 5] Hilt W., Wolf D.H. Trends Biochem. Sci. 21:96-102(1996). 

[ 6] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994). 

Proteasome B-type subunits signature 

The proteasome (or macropain) (EC 3.4.99.46) [1 to 5,E1] is a eukaryotic and 
archaebacterial multicatalytic proteinase complex that seems to be involved in an 
ATP/ubiquitin-dependent nonlysosomal proteolytic pathway. In eukaryotes the proteasome is 
composed of about 28 distinct subunits which form a highly ordered ring-shaped structure 
(20S ring) of about 700 Kd. Most proteasome subunits can be classified, on the basis of 
sequence similarities, into two groups, A and B. Subunits that belong to the B-type group are 
proteins of from 190 to 290 amino acids that share a number of conserved sequence regions. 
Subunits that are known to belong to this family are listed below. - Vertebrate subunits C5, 
beta, delta, epsilon, theta (C10-II), LMP2/RING12, C13 (LMP7/RING10), C7-I and 
MECL-1. - Yeast PRE1, PRE2 (PRG1), PRE3, PRE4, PRS3, PUP1 and PUP3. - Drosophila 
L(3)73AI. - Fission yeast pts1. - Thermoplasma acidophilum beta-subunit. In this 
archaebacterium the proteasome is composed of only two different subunits. As a signature 
pattern for proteasome B-type subunits the best conserved region was selected, which is 
located in the N-terminal part of these proteins. 

Consensus pattern: [LIVMA]-[GSA]-[LIVMF]-x-[FYLVGAC]-x(2)-[GSACFY]-
[LIVMSTAC](3)-[GAC]-[GSTACV]-[DES]-x(15)-[RK]-x(12,13)-G-x(2)-[GSTA]-D. These 
proteins belong to family T1 in the classification of peptidases [6,E2]. 

[ 1] Rivett A.J. Biochem. J. 291:1-10(1993). 

[ 2] Rivett A.J. Arch. Biochem. Biophys. 268:1-8(1989). 

[ 3] Goldberg A.L., Rock K.L. Nature 357:375-379(1992). 

[ 4] Wilk S. Enzyme Protein 47:187-188(1993). 

[ 5] Hilt W., Wolf D.H. Trends Biochem. Sci. 21:96-102(1996). 

[ 6] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994). 

471. (pyr redox) Pyridine nucleotide-disulphide oxidoreductases class-I active site 

The pyridine nucleotide-disulphide oxidoreductases are FAD flavoproteins which contain a 
pair of redox-active cysteines involved in the transfer of reducing equivalents from the FAD 
cofactor to the substrate. On the basis of sequence and structural similarities [1] these 
enzymes can be classified into two categories. The first category groups together the 
following enzymes [2 to 6]: - Glutathione reductase (EC 1.6.4.2) (GR). - Higher eukaryotes 
thioredoxin reductase (EC 1.6.4.5). - Trypanothione reductase (EC 1.6.4.8). - Lipoamide 
dehydrogenase (EC 1.8.1.4), the E3 component of alpha-ketoacid dehydrogenase complexes. 
- Mercuric reductase (EC 1.16.1.1). The sequence around the two cysteines involved in the 
redox-active disulfide bond is conserved and can be used as a signature pattern. 

Consensus pattern: G-G-x-C-[LIVA]-x(2)-G-C-[LIVM]-P [The two C's form the active site 
disulfide bond]. In positions 6 and 7 of the pattern all known sequences have Asn-(Val/Ile) 
with the exception of GR from plant chloroplasts and from cyanobacteria which have Ile-Arg 
[7]. 

[ 1] Kuriyan J., Krishna T.S.R., Wong L., Guenther B., Pahler A., Williams C.H. Jr., Model 
P. Nature 352:172-174(1991). 

[ 2] Rice D.W., Schulz G.E., Guest J.R. J. Mol. Biol. 174:483-496(1984). 
[ 3] Brown N.L. Trends Biochem. Sci. 10:400-402(1985). 

[ 4] Carothers D.J., Pons G., Patel M.S. Arch. Biochem. Biophys. 268:409-425(1989). 
[ 5] Walsh C.T., Bradley M., Nadeau K. Trends Biochem. Sci. 16:305-309(1991). 
[ 6] Gasdaska P.Y., Gasdaska J.R., Cochran S., Powis G. FEBS Lett. 373:5-9(1995). 
[ 7] Creissen G., Edwards E.A., Enard C., Wellburn A., Mullineaux P. Plant J. 2:129-
131(1991). 

472. (pyridoxal deC) DDC / GAD / HDC / TyrDC pyridoxal-phosphate attachment site 

Four different enzymes - all pyridoxal-dependent decarboxylases - seem to share regions of 
sequence similarity [1,2,3,4], especially in the vicinity of the lysine residue which serves as 
the attachment site for the pyridoxal-phosphate (PLP) group. These enzymes are: - Glutamate 
decarboxylase (EC 4.1.1.15) (GAD). Catalyzes the decarboxylation of glutamate into the 
neurotransmitter GABA (4-aminobutanoate). - Histidine decarboxylase (EC 4.1.1.22) (HDC). 
Catalyzes the decarboxylation of histidine to histamine. There are two completely unrelated 
types of HDC: those that use PLP as a cofactor (found in Gram-negative bacteria and 
mammals), and those that contain a covalently bound pyruvoyl residue (found in 
Gram-positive bacteria). - Aromatic-L-amino-acid decarboxylase (EC 4.1.1.28) (DDC), also known 
as L-dopa decarboxylase or tryptophan decarboxylase. DDC catalyzes the decarboxylation of 
tryptophan to tryptamine. It also acts on 5-hydroxytryptophan and dihydroxyphenylalanine 
(L-dopa). - Tyrosine decarboxylase (EC 4.1.1.25) (TyrDC) which converts tyrosine into 
tyramine, a precursor of isoquinoline alkaloids and various amides. These enzymes are 
collectively known as group II decarboxylases [3,4]. 

Consensus pattern: S-[LIVMFYW]-x(5)-K-[LIVMFYWG](2)-x(3)-[LIVMFYW]-x-[CA]- 
x(2)-[LIVMFYWQ]-x(2)-[RK] [K is the pyridoxal-P attachment site] 

[ 1] Jackson F.R. J. Mol. Evol. 31:325-329(1990). 

[ 2] Joseph D.R., Sullivan P., Wang Y.-M., Kozak C., Fenstermacher D.A., Behrendsen M.E., 
Zahnow C.A. Proc. Natl. Acad. Sci. U.S.A. 87:733-737(1990). 

[ 3] Sandmeier E., Hale T.I., Christen P. Eur. J. Biochem. 221:997-1002(1994). 

[ 4] Ishii S., Mizuguchi H., Nishino J., Hayashi H., Kagamiyama H. J. Biochem. 120:369-376(1996). 

473. Regulator of chromosome condensation (RCC1) signatures (RCC1) 

The regulator of chromosome condensation (RCC1) [1] is a eukaryotic protein which binds to 
chromatin and interacts with ran, a nuclear GTP-binding protein, to promote the loss of 
bound GDP and the uptake of fresh GTP, thus acting as a guanine-nucleotide dissociation 
stimulator (GDS) [2]. The interaction of RCC1 with ran probably plays an important role in 
the regulation of gene expression. RCC1, known as PRP20 or SRM1 in yeast, pim1 in fission 
yeast and BJ1 in Drosophila, is a protein that contains seven tandem repeats of a domain of 
about 50 to 60 amino acids. As shown in the following schematic representation, the repeats 
make up the major part of the length of the protein. Outside the repeat region, there is just a 
small N-terminal domain of about 40 to 50 residues and, in the Drosophila protein only, a 
C-terminal domain of about 130 residues. 

[Schematic: N-t. | Rpt. 1 | Rpt. 2 | Rpt. 3 | Rpt. 4 | Rpt. 5 | Rpt. 6 | Rpt. 7 | C-terminal (in Drosophila)] 

Two signature patterns for RCC1 were developed. The first is found in the N-terminal part of the second 
repeat; this is the most conserved part of RCC1. The second is derived from conserved 
positions in the C-terminal part of each repeat and detects up to five copies of the repeated 
domain. The RCC1-type of repeat is also found in the X-linked retinitis pigmentosa GTPase 
regulator [3]. 

Consensus pattern: G-x-N-D-x(2)-[AV]-L-G-R-x-T

Consensus pattern: [LIVMFA]-[STAGC](2)-G-x(2)-H-[STAGLI]-[LIVMFA]-x-[LIVM]
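Because the second pattern can occur once per repeat, a scan for it returns several hits rather than one. The following is a minimal sketch of counting such hits, assuming a hand translation of the second pattern into a regular expression. The function name `repeat_starts`, the repeat unit and the spacer are hypothetical and were chosen only so that the toy sequence contains three copies.

```python
import re

# Hand translation of the second RCC1 consensus pattern.
RCC1_REPEAT = re.compile("[LIVMFA][STAGC]{2}G.{2}H[STAGLI][LIVMFA].[LIVM]")

def repeat_starts(sequence):
    """Return the 0-based start of every non-overlapping match of the pattern."""
    return [m.start() for m in RCC1_REPEAT.finditer(sequence)]

unit = "VSAGDSHTALL"   # hypothetical 11-residue stretch that satisfies the pattern
spacer = "PEKDGK"      # hypothetical linker that does not satisfy it
toy = "KK" + unit + spacer + unit + spacer + unit + "EE"

starts = repeat_starts(toy)
print(len(starts), starts)
```

On a real sequence, the number of hits would be expected to be at most the number of repeats, consistent with the statement that the pattern detects up to five copies of the repeated domain.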

[ 1] Dasso M. Trends Biochem. Sci. 18:96-101(1993). 

[ 2] Boguski M.S., McCormick F. Nature 366:643-654(1993). 

[ 3] Roepman R., Van Duijnhoven G., Rosenberg T., Pinckers A.J.L.G., 
Bleeker-Wagemakers L.M., Bergen A.A.B., Post J., Beck A., Reinhardt R., Ropers H.-H., Cremers F., 
Berger W. Hum. Mol. Genet. 5:1035-1041(1996). 

474. RNA 3'-terminal phosphate cyclase signature (RCT) 

RNA 3'-terminal phosphate cyclase (EC 6.5.1.4) [1,2] catalyzes the conversion of 
3'-phosphate to a 2',3'-cyclic phosphodiester at the end of RNA. The biological role of this 
enzyme is unknown but it is likely to function in some aspects of cellular RNA processing. 
The reaction catalyzed by the enzyme occurs in three steps: 1) adenylation of the enzyme by 
ATP; 2) the enzyme acts on RNA-3'-terminal phosphate to produce RNA-3'-terminal 
diphosphate adenylate; 3) release of AMP and cyclization by a non-catalytic nucleophilic 
attack by the adjacent 2'-hydroxyl on the phosphorus in the diester linkage. This enzyme, 
which has been characterized in human (where there seem to be at least three isozymes) and 
Escherichia coli (gene rtcA), seems to be taxonomically widespread. It is found in insects, 
plants, fungi (gene RTC1 in yeast) and in archaebacteria. RNA cyclase is a protein of from 36 
to 42 Kd. The best conserved region, which is used as a signature pattern, is a glycine-rich 
stretch of residues located in the central part of the sequence and which is reminiscent of 
various ATP, GTP or AMP glycine-rich loops. In this context, the conserved Arg (His in the 
E. coli enzyme) could be the AMP-binding residue. 

Consensus pattern: [RH]-G-x(2)-P-x-G(3)-x-[LIV]

[ 1] Genschik P., Billy E., Swianiewicz M., Filipowicz W. EMBO J. 16:2955-2967(1997). 
[ 2] Filipowicz W., Vicente O. Meth. Enzymol. 181:499-510(1990). 

475. REV protein (anti-repression trans-activator protein) 

476. Prokaryotic-type class I peptide chain release factors signature (RF-1) 

Peptide chain release factors (RFs) are required for the termination of protein biosynthesis
[1]. At present two classes of RFs can be distinguished. Class I RFs bind to ribosomes that
have encountered a stop codon at their decoding site and induce release of the nascent
polypeptide. Class II RFs are GTP-binding proteins that interact with class I RFs and enhance
class I RF activity. In prokaryotes there are two class I RFs that act in a codon-specific
manner [2]: RF-1 (gene prfA) mediates UAA- and UAG-dependent termination while RF-2
(gene prfB) mediates UAA- and UGA-dependent termination. RF-1 and RF-2 are structurally
and evolutionarily related proteins which have been shown [3] to make up a family that also
contains the following proteins: - Fungal MRF1, a mitochondrial RF (m-RF) which
recognizes the UAA and UAG codons. - Escherichia coli RF-H, a protein of unknown
function. - Escherichia coli hypothetical protein yaeJ and a close Pseudomonas putida
homolog. A highly conserved region located in the central part of the 40 to 45 Kd RF-1/2 and
m-RF and in the N-terminal part of the 15 to 16 Kd RF-H and yaeJ is used as a signature pattern.

Consensus pattern: [AR]-[STA]-x-G-x-G-G-Q-[HNGCS]-V-N-x(3)-[ST]-A-[IV]

Note that prokaryotic-type class I RFs display no significant sequence similarity to
prokaryotic-type class II RFs, which belong to the family of GTP-binding elongation factors,
nor to eukaryotic class I or class II RFs.
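
The codon specificity described for the two prokaryotic class I release factors can be written down as a small lookup. The Python sketch below encodes only what the text states (RF-1 acts at UAA and UAG, RF-2 at UAA and UGA); the names RF_CODONS and release_factors_for are hypothetical and chosen for illustration.

```python
# Codon specificity as stated in the text above:
# RF-1 mediates UAA/UAG termination, RF-2 mediates UAA/UGA termination.
RF_CODONS = {
    "RF-1": {"UAA", "UAG"},
    "RF-2": {"UAA", "UGA"},
}

def release_factors_for(codon: str) -> list:
    """Return the class I release factors that act at the given codon."""
    # Accept DNA-style input by converting T to U.
    codon = codon.upper().replace("T", "U")
    return sorted(rf for rf, codons in RF_CODONS.items() if codon in codons)

for stop in ("UAA", "UAG", "UGA"):
    print(stop, release_factors_for(stop))
```

UAA is the only stop codon served by both factors, so the lookup returns two names for it and one name for each of the others.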

[ 1] Tate W.P., Poole E.S., Mannering S.M. Prog. Nucleic Acids. Res. Mol. Biol. 52:293-335(1996).

[ 2] Craigen W.J., Lee C.C., Caskey C.T. Mol. Microbiol. 4:861-865(1990). 
[ 3] Pel H.J., Rep M., Grivell L.A. Nucleic Acids Res. 20:4423-4428(1992). 

477. RIO1/ZK632.3/MJ0444 family signature 

The following uncharacterized proteins are evolutionarily related [1]: - Yeast protein RIO1. -
Caenorhabditis elegans hypothetical protein ZK632.3. - Methanococcus jannaschii
hypothetical protein MJ0444. - Thermoplasma acidophilum hypothetical protein in rpoA2
3'region. The eukaryotic members of this family are proteins of about 55 to 60 Kd, while the
archaebacterial ones are half that size. The central part of these proteins is highly conserved.
The best conserved region is used as a signature pattern.

Consensus pattern: [LIVM]-V-H-[GA]-D-L-S-E-[FY]-N-x-[LIVM] 

[ 1] Bairoch A. Unpublished observations (1997). 

478. (RIP) Shiga/ricin ribosomal inactivating toxins active site signature

A number of bacterial and plant toxins act by inhibiting protein synthesis in eukaryotic cells.
The toxins of the Shiga and ricin family inactivate 60S ribosomal subunits by an N-glycosidic
cleavage which releases a specific adenine base from the sugar-phosphate backbone of 28S rRNA
[1,2,3]. The toxins which are known to function in this manner are:

- Shiga toxin from Shigella dysenteriae [4]. This toxin is composed of one copy of an
enzymatically active A subunit and five copies of a B subunit responsible for binding the
toxin complex to specific receptors on the target cell surface.

- Shiga-like toxins (SLT), a group of Escherichia coli toxins very similar in their structure
and properties to Shiga toxin. The sequence of two types of these toxins, SLT-1 [5] and
SLT-2 [6], is known.

- Ricin, a potent toxin from castor bean seeds. Ricin consists of two glycosylated chains
linked by a disulfide bond. The A chain is enzymatically active. The B chain is a lectin with
a binding preference for galactosides. Both chains are encoded by a single polypeptide
precursor. Ricin is classified as a type-II ribosome-inactivating protein (RIP); other members
of this family are agglutinin, also from castor bean, and abrin from the seeds of the bean
Abrus precatorius [7].

- Single chain ribosome-inactivating proteins (type-I RIP) from plants. Examples of such
proteins are: barley protein synthesis inhibitors I and II, Mongolian snake-gourd
trichosanthin, sponge gourd luffin-A and -B, garden four-o'clock MAP, common pokeberry PAP-S
and soapwort saporin-6 [7].

All these toxins are structurally related. A conserved glutamic acid residue has been
implicated [8] in the catalytic mechanism; it is located near a conserved arginine which also
plays a role in catalysis [9]. The signature that has been developed for these proteins includes
these catalytic residues.

Consensus pattern: [LIVMA]-x-[LIVMSTA](2)-x-E-[SAGV]-[STAL]-R-[FY]-[RKNQS]-x-
[LIVM]-[EQS]-x(2)-[LIVMF] [E and R are active site residues]
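
Because this signature marks two specific residues (E and R) as active-site residues, a search can report their positions as well as the presence of the motif. The Python sketch below is one possible way to do this, assuming the pattern is written as a regular expression with one capture group per annotated residue. The names RIP_SIGNATURE and active_site_positions and the example peptide are hypothetical; the peptide is made up and is not a real toxin sequence.

```python
import re

# Regex form of the consensus pattern above; the two capture groups mark
# the glutamic acid (E) and arginine (R) annotated as active-site residues.
RIP_SIGNATURE = re.compile(
    r"[LIVMA].[LIVMSTA]{2}.(E)[SAGV][STAL](R)[FY][RKNQS].[LIVM][EQS].{2}[LIVMF]"
)

def active_site_positions(sequence: str):
    """Return 1-based (E, R) positions for the first signature match, or None."""
    match = RIP_SIGNATURE.search(sequence.upper())
    if match is None:
        return None
    return match.start(1) + 1, match.start(2) + 1

# Made-up peptide for illustration only.
example = "MKLAIAAEAARFRALEAALGG"
print(active_site_positions(example))
```

Only the first match is reported; a sequence with several candidate motifs would need re.finditer instead of search.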

[ 1] Endo Y., Tsurugi K., Takeda Y., Ogasawara T., Igarashi K. Eur. J. Biochem. 171:45-50(1988).
[ 2] May M.J., Hartley M.R., Roberts L.M., Krieg P.A., Osborn R.W., Lord J.M.
EMBO J. 8:301-308(1989).
[ 3] Funatsu G., Islam M.R., Minami Y., Sung-Sil K., Kimura M. Biochimie 73:1157-1161(1991).
[ 4] Strockbine N.A., Jackson M.P., Sung L.M., Holmes R.K., O'Brien A.D.
J. Bacteriol. 170:1116-1122(1988).
[ 5] Calderwood S.B., Auclair F., Donohue-Rolfe A., Keusch G.T., Mekalanos J.J.
Proc. Natl. Acad. Sci. U.S.A. 84:4364-4368(1987).
[ 6] Jackson M.P., Neill R.J., O'Brien A.D., Holmes R.K., Newland J.W.
FEMS Microbiol. Lett. 44:109-114(1987).
[ 7] Barbieri L., Battelli M.G., Stirpe F. Biochim. Biophys. Acta 1154:237-282(1993).
[ 8] Hovde C.J., Calderwood S.B., Mekalanos J.J., Collier R.J.
Proc. Natl. Acad. Sci. U.S.A. 85:2568-2572(1988).
[ 9] Monzingo A.F., Collins E.J., Ernst S.R., Irvin J.D., Robertus J.D.
J. Mol. Biol. 233:705-715(1993).

479. Bacterial RNA polymerase, alpha chain (RNA pol A bac)

Members of this family include the alpha subunit from eubacteria and alpha subunits from
chloroplasts. The alpha subunit of RNA polymerase consists of two independently folded
domains, referred to as the amino-terminal and carboxyl-terminal domains. The amino-terminal
domain is involved in the interaction with the other subunits of the RNA polymerase. The
carboxyl-terminal domain interacts with the DNA and activators. The amino acid sequence of
the alpha subunit is conserved in prokaryotic and chloroplast RNA polymerases. There are
three regions of particularly strong conservation, two in the amino-terminal and one in the
carboxyl-terminal [3].

[1] Zhang G, Darst SA; Science 1998;281:262-266. [2] Jeon YH, Negishi T, Shirakawa M, 
Yamazaki T, Fujita N, Ishihama A, Kyogoku Y; Science 1995;270:1495-1497. [3] Ebright 
RH, Busby S; Curr Opin Genet Dev 1995;5:197-203. [4] Murakami K, Kimura M, Owens JT, 
Meares CF, Ishihama A; Proc Natl Acad Sci USA 1997;94:1709-1714. 

480. RNA polymerase beta subunit (RNA pol B) 

RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain
a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and
chloroplast polymerases). Each RNA polymerase complex contains two related members of
this family; in each case they are the two largest subunits.

[1] Falkenburg D, Dworniczak B, Faust DM, Bautz EK; J Mol Biol 1987;195:929-937. 

481. RNA polymerases H / 23 Kd subunits signature 

In eukaryotes, there are three different forms of DNA-dependent RNA polymerases (EC
2.7.7.6) transcribing different sets of genes. Each class of RNA polymerase is an assemblage
of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of
RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides.
Archaebacterial subunit H (gene rpoH) [1,2] is a small protein of about 8.5 to 10 Kd; it is
evolutionarily related to the C-terminal part of a 23 Kd component shared by all three forms of
eukaryotic RNA polymerases (gene RPB5 in yeast and POLR2E in mammals). As a signature
pattern a conserved region was selected which is located at the N-terminal extremity of
subunit H; this region contains two histidines that could play a role in the binding of a metal
ion.

Consensus pattern: H-[NEI]-[LIVM]-V-P-x-H-x(2)-[LIVM]-x(2)-[DE] 

[ 1] Klenk H.-P., Palm P., Lottspeich F., Zillig W. Proc. Natl. Acad. Sci. U.S.A. 89:407- 
410(1992). 

[ 2] Thiru A., Hodach M., Eloranta J.J., Kostourou V., Weinzierl R.O., Matthews S.; J. Mol. 
Biol. 287:753-760(1999). 

482. RNA polymerases K / 14 to 18 Kd subunits signature 

In eukaryotes, there are three different forms of DNA-dependent RNA polymerases (EC
2.7.7.6) transcribing different sets of genes. Each class of RNA polymerase is an assemblage
of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of
RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. A
component of 14 to 18 Kd shared by all three forms of eukaryotic RNA polymerases and
which has been sequenced in budding yeast (gene RPB6 or RPO26), in fission yeast (gene
rpb6 or rpo15), in human and in African swine fever virus [1] is evolutionarily related [2] to
archaebacterial subunit K (gene rpoK). The archaebacterial protein is colinear with the
C-terminal part of the eukaryotic subunit.

Consensus pattern: [ST]-x-[FY]-E-x-[AT]-R-x-[LIVM]-[GSA]-x-R-[SA]-x-Q 

[ 1] Lu Z., Kutish G.F., Sussman M.D., Rock D.L. Nucleic Acids Res. 21:2940-2940(1993). 
[ 2] McKune K., Woychik N.A. J. Bacteriol. 176:4754-4756(1994). 

483. RNA polymerases L / 13 to 16 Kd subunits signature 

In eukaryotes, there are three different forms of DNA-dependent RNA polymerases (EC
2.7.7.6) transcribing different sets of genes. Each class of RNA polymerase is an assemblage
of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of
RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. It
has been shown that small subunits of about 13 to 16 Kd found in all three types of
eukaryotic polymerases are highly conserved. Subunits known to belong to this family are: -
Budding yeast RPC19 subunit from RNA polymerases I and III [1]. - Budding yeast RPB11
subunit from RNA polymerase II [2]. - Mammalian RPB11 (gene POLR2K) from RNA
polymerase II. - Caenorhabditis elegans hypothetical protein F58A4.9. - Methanococcus
jannaschii RNA polymerase subunit L (gene rpoL). - Sulfolobus acidocaldarius RNA
polymerase subunit L (gene rpoL) [3]. As a signature pattern a conserved region was selected
which is located at the N-terminal extremity of these polymerase subunits; this region
contains two cysteines that could play a role in the binding of a metal ion.

Consensus pattern: [DE](2)-H-[ST]-[LIVM]-[GAP]-N-x(11)-V-x-[FM]-x(2)-Y-x(3)-H-P

[ 1] Dequard-Chablat M., Riva M., Carles C., Sentenac A. J. Biol. Chem. 266:15300-15307(1991).

[ 2] Woychik N.A., McKune K., Lane W.S., Young R.A. Gene Expr. 3:77-82(1993). 
[ 3] Langer D. EMBL/GenBank: X70805. 



484. RNA polymerases N / 8 Kd subunits signature 

In eukaryotes, there are three different forms of DNA-dependent RNA polymerases (EC
2.7.7.6) transcribing different sets of genes. Each class of RNA polymerase is an assemblage
of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of
RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides.
Archaebacterial subunit N (gene rpoN) [1] is a small protein of about 8 Kd; it is evolutionarily
related [2] to an 8.3 Kd component shared by all three forms of eukaryotic RNA polymerases
(gene RPB10 in yeast and POLR2J in mammals) as well as to African swine fever virus
protein CP80R [3]. As a signature pattern a conserved region was selected which is located at
the N-terminal extremity of these polymerase subunits; this region contains two cysteines that
could play a role in the binding of a metal ion.



Consensus pattern: [LIVMF](2)-P-[LIVM]-x-C-F-[ST]-C-G-

[ 1] Langer D., Hain J., Thuriaux P., Zillig W. Proc. Natl. Acad. Sci. U.S.A. 92:5768-5772(1995).

[ 2] McKune K., Woychik N.A. J. Bacteriol. 176:4754-4756(1994). 

[ 3] Yanez R.J., Rodriguez J.M., Nogal M.L., Yuste L., Enriquez C, Rodriguez J.F., Vinuela 
E. Virology 208:249-278(1995). 

485. Ribonuclease HII 

[1] Mian IS; Nucleic Acids Res 1997;25:3187-3189. 

486. Ribonuclease PH signature 

Prokaryotic ribonuclease PH (EC 2.7.7.56) (RNase PH) [1] is a phosphorolytic
exoribonuclease that removes nucleotide residues following the -CCA
terminus of tRNA and adds nucleotides to the ends of RNA molecules by using nucleoside
diphosphates as substrates. RNase PH is a conserved protein of about 240 amino-acid
residues. It is evolutionarily related to Caenorhabditis elegans hypothetical protein B0564.1. As
a signature pattern, the most highly conserved region was selected which is located in the
central part of these proteins.

Consensus sequence: C-[DE]-[LIVM](2)-Q-[GTA]-D-G-[SG]-x(2)-[TA]-A 
[ 1] Kelly K.O., Deutscher M.P. J. Biol. Chem. 267:17153-17158(1992). 

487. RanBP1 domain

[1] Di Matteo G, Fuschi P, Zerfass K, Moretti S, Ricordy R, Cenciarelli C, Tripodi M, 
Jansen-Durr P, Lavia P; Cell Growth Differ 1995;6:1213-1224. 

488. Rhodanese signatures 

Rhodanese (thiosulfate sulfurtransferase) (EC 2.8.1.1) [1,2] is an enzyme which catalyzes the
transfer of the sulfane atom of thiosulfate to cyanide, to form sulfite and thiocyanate. In
vertebrates, rhodanese is a mitochondrial enzyme of about 300 amino-acid residues involved
in forming iron-sulfur complexes and cyanide detoxification. A cysteine residue takes part in
the catalytic mechanism. Some bacterial proteins closely related to rhodanese are also
thought to express a sulfotransferase activity. These are: - Azotobacter vinelandii rhdA. -
Escherichia coli sseA [3]. - Saccharopolyspora erythraea cysA [4]. - Synechococcus strain
PCC 7942 rhdA [5]. RhdA is a periplasmic protein probably involved in the transport of
sulfur compounds. Two patterns for the rhodanese family were developed. They are based on
highly conserved regions, one of which is located in the N-terminal region, the other at the
C-terminal extremity of the enzyme.

Consensus pattern: [FY]-x(3)-H-[LIV]-P-G-A-x(2)-[LIVF] 
Consensus pattern: [FY]-[DEAP]-G-[SA]-W-x-E-[FYW] 
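
Since two separate patterns were developed for the rhodanese family, a sequence can be checked against each one independently. The Python sketch below shows one possible arrangement, with both consensus patterns written as regular expressions and all match positions reported per pattern. The labels "N-terminal" and "C-terminal", the function name find_signatures and the test sequence are hypothetical; the sequence is made up for illustration and is not a real rhodanese.

```python
import re

# The two consensus patterns above, rewritten as regular expressions.
RHODANESE_PATTERNS = {
    "N-terminal": r"[FY].{3}H[LIV]PGA.{2}[LIVF]",
    "C-terminal": r"[FY][DEAP]G[SA]W.E[FYW]",
}

def find_signatures(sequence: str) -> dict:
    """Map each pattern label to the 1-based start positions of its matches."""
    sequence = sequence.upper()
    return {
        label: [m.start() + 1 for m in re.finditer(regex, sequence)]
        for label, regex in RHODANESE_PATTERNS.items()
    }

# Made-up sequence containing one copy of each motif.
example = "MMFAAAHIPGAKKLGGGGYDGSWAEWKK"
print(find_signatures(example))
```

An empty list for one label means that the corresponding signature is absent, which is a convenient way to flag proteins that carry only one of the two conserved regions.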

[ 1] Westley J. Meth. Enzymol. 77:285-291(1981). 

[ 2] Weiland K.L., Dooley T.P. Biochem. J. 275:227-231(1991). 

[ 3] Rudd K.E. Unpublished observations (1993). 

[ 4] Donadio S., Shafiee A., Hutchinson C.R. J. Bacteriol. 172:350-360(1990). 

[ 5] Laudenbach D.E., Ehrhardt D., Green L., Grossman A.R. J. Bacteriol. 173:2751-2760(1991).

489. Ribonuclease III family signature 

Prokaryotic ribonuclease III (EC 3.1.26.3) (gene rnc) [1] is an enzyme that digests
double-stranded RNA. It is involved in the processing of ribosomal RNA precursors and of some
mRNAs. RNase III is evolutionarily related [2] to the following proteins: - Fission yeast pac1,
a ribonuclease that probably inhibits mating and meiosis by degrading a specific mRNA
required for sexual development. - Yeast ribonuclease III (gene RNT1), a dsRNA-specific
nuclease that cleaves eukaryotic preribosomal RNA at various sites. - Caenorhabditis elegans
hypothetical protein F26E4.13. - Paramecium bursaria chlorella virus 1 protein A464R. -
Synechocystis strain PCC 6803 hypothetical protein slr0346. - Fission yeast hypothetical
protein SpAC8A4.08c, a protein with an N-terminal helicase domain and a C-terminal RNase
III domain. - Caenorhabditis elegans hypothetical protein K12H4.8, a protein with the same
structure as SpAC8A4.08c. These proteins share regions of sequence similarity, one of which
is a highly conserved stretch of 9 residues which has been developed as a signature pattern.

Consensus pattern: [DEQ]-[RQ]-[LM]-E-[FYW]-[LV]-G-D-[SAR]-

[ 1] Nashimoto H., Uchida H. Mol. Gen. Genet. 201:25-29(1985). 
[ 2] Mian I.S. Nucleic Acids Res. 25:3187-3195(1997). 

490. Rieske iron-sulfur protein signatures 

Ubiquinol-cytochrome c reductase (EC 1.10.2.2) (also known as the bc1 complex or complex
III) is one of the electron transport chains of mitochondria and of some aerobic prokaryotes;
it catalyzes the oxidoreduction of ubiquinol and cytochrome c. In the chloroplast of plants
and in cyanobacteria plastoquinone-plastocyanin reductase (EC 1.10.99.1) (also known as the
b6f complex) is functionally similar and catalyzes the oxidoreduction of plastoquinol and
cytochrome f. One of the components of these electron transfer systems is an iron-sulfur
protein with a 2Fe-2S cluster, which is called the Rieske protein [1,2]. The Rieske protein
contains approximately 190 amino acid residues. The iron-sulfur cluster is complexed to the
protein through cysteine and histidine residues. Two perfectly conserved regions in Rieske
proteins contain all the residues that bind the iron-sulfur cluster. Both regions contain two
cysteines and a histidine. The first cysteine and the histidine are 2Fe-2S ligands while the
remaining cysteines form a disulfide bond [3]. Two conserved regions were selected as
signature patterns.

Consensus pattern: C-[TK]-H-L-G-C-[LIVST] [The first C and the H are 2Fe-2S ligands] 
[The second C is involved in a disulfide bond] 

Consensus pattern: C-P-C-H-x-[GSA] [The first C and the H are 2Fe-2S ligands] [The second 
C is involved in a disulfide bond] 

[ 1] Gatti F.L., Meinhardt S.W., Ohnishi T., Tzagoloff A. J. Mol. Biol. 205:421-435(1989). 
[ 2] Kallas T., Spiller S., Malkin R. Proc. Natl. Acad. Sci. U.S.A. 85:5794-5798(1988). 
[ 3] Iwata S., Saynovits M., Link T.A., Michel H. Structure 4:567-579(1996). 

491. Ribosomal protein L1 signature

Ribosomal protein L1 is the largest protein from the large ribosomal subunit. In Escherichia
coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins
which, on the basis of sequence similarities [1, 2], groups: - Eubacterial L1. - Algal and plant
chloroplast L1. - Cyanelle L1. - Archaebacterial L1. - Vertebrate L10A. - Yeast SSM1. As a
signature pattern, the best conserved region was selected located in the central section of
these proteins. It is located at the end of an alpha helix thought to be involved in
RNA-binding.

Consensus pattern: [IM]-x(2)-[LIVA]-x(2,3)-[LIVM]-G-x(2)-[LMS]-[GSNH]-[PTKR]-
[KRAV]-G-x-[LIMF]-P-[DENSTKQ]

[ 1] Nikonov S.V., Nevskaya N., Eliseikina I.A., Fomenkova N.P., Nikulin A., Ossina N., 
Garber M., Jonsson B.-H., Briand C., Al-Karadaghi S., Svensson L.A., Aevarsson A., Liljas 
A. EMBO J. 15:1350-1359(1996). 

[ 2] Olvera J., Wool I.G. Biochem. Biophys. Res. Commun. 220:954-957(1996).

492. Ribosomal protein L10 signature

Ribosomal protein L10 is one of the proteins from the large ribosomal subunit. L10 is a
protein of 162 to 185 amino-acid residues which has only been found so far in eubacteria. A
conserved region located in the N-terminal section of these proteins was used as a signature
pattern.

Consensus pattern: [DEH]-x(2)-[GS]-[LIVMF]-[STN]-[VA]-x-[DEQK]-[LIVMA]-x(2)- 
[LIM]-R 

493. Ribosomal protein L10e signature

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis
of sequence similarities. One of these families consists of: - Vertebrate L10 (QM) [1]. - Plant
L10. - Caenorhabditis elegans L10 (F10B5.1). - Yeast L10 (QSR1). - Methanococcus
jannaschii MJ0543. These proteins have 174 to 232 amino-acid residues. A conserved region
located in the central section was selected as a signature pattern.

Consensus pattern: R-x-A-[FYW]-G-K-[PA]-x-G-x(2)-A-R-V

[ 1] Chan Y.-L., Diaz J.-J., Denoroy L., Madjar J.-J., Wool I.G.
Biochem. Biophys. Res. Commun. 255:952-956(1996).

494. Ribosomal protein L11 signature

Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In
Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of
ribosomal proteins which, on the basis of sequence similarities [1,2], groups:

- Eubacterial L11.

- Plant chloroplast L11 (nuclear-encoded).

- Red algal chloroplast L11.

- Cyanelle L11.

- Archaebacterial L11.

- Mammalian L12.

- Plants L12.

- Yeast L12 (YL15).

L11 is a protein of 140 to 165 amino-acid residues. A conserved region located in the
C-terminal section of these proteins was selected as a signature pattern. In Escherichia coli, the
C-terminal half of L11 has been shown [3] to be in an extended and loosely folded
conformation and is likely to be buried within the ribosomal structure.

Consensus pattern: [RKN]-x-[LIVM]-x-G-[ST]-x(2)-[SNQ]-[LIVM]-G-x(2)-[LIVM]-x(0,1)-
[DENG]

[ 1] Pucciarelli G., Remacha M., Ballesta J.P.G.; Nucleic Acids Res. 18:4409-4416(1990). 
[ 2] Otaka E., Hashimoto T., Mizuta K., Suzuki K.; Protein Seq. Data Anal. 5:301- 
313(1993). 

[ 3] Choli T. Biochem. Int. 19:1323-1338(1989).

495. Ribosomal protein L7/L12 C-terminal domain

[1] Leijonmarck M, Liljas A; J Mol Biol 1987;195:555-579. 

496. Ribosomal protein L13 signature 

Ribosomal protein L13 is one of the proteins from the large ribosomal subunit.
In Escherichia coli, L13 is known to be one of the early assembly proteins of
the 50S ribosomal subunit. It belongs to a family of ribosomal proteins which,
on the basis of sequence similarities [1], groups: - Eubacterial L13.

- Plant chloroplast L13 (nuclear-encoded). - Red algal chloroplast L13.

- Archaebacterial L13. - Mammalian L13a (Tum P198). - Yeast Rp22 and Rp23.

L13 is a protein of 140 to 250 amino-acid residues. As a signature pattern, a
conserved region was selected located in the C-terminal section of these
proteins.

Consensus pattern: [LIVM]-[KRV]-[GK]-M-[LIV]-[PS]-x(4,5)-[GS]-[NQEKRA]-x(5)- 
[LIVM]-x-[AIV]-[LFY]-x-[GDN] 

[ 1] Chan Y.-L., Olvera J., Glueck A., Wool I.G. J. Biol. Chem. 269:5589-5594(1994). 

497. Ribosomal protein L13e signature 

A number of eukaryotic ribosomal proteins can be grouped on the basis of 
sequence similarities [1]. One of these families consists of: 

- Vertebrate L13 (was previously known as Breast Basic Conserved protein 1
(BBC1)). - Drosophila L13. - Plant L13. - Yeast probable L13 (YM9375.11c).

These proteins have 199 to 218 amino-acid residues. As a signature pattern,
a stretch of about 16 residues in the first third of these proteins was selected.



-Consensus pattern: [KR]-Y-x(2)-K-[LIVM]-R-[STA]-G-[KR]-G-F-[ST]-L-x-E

[ 1] Olvera J., Wool I.G. Biochem. Biophys. Res. Commun. 201:102-107(1994).

498. Ribosomal protein L14 signature

Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. 
In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a 
family of ribosomal proteins which, on the basis of sequence similarities [1], 
groups: - Eubacterial L14. - Algal and plant chloroplast L14. - Cyanelle L14. 

- Archaebacterial L14. - Yeast L17A. - Mammalian L23. 

- Caenorhabditis elegans L23 (B0336.10). - Higher eukaryotes mitochondrial L14. 

- Yeast mitochondrial YmL38 (gene MRPL38).

L14 is a protein of 119 to 137 amino-acid residues. As a signature pattern,
a conserved region located in the C-terminal half of these proteins was selected.

-Consensus pattern: [GA]-[LIV](3)-x(9,10)-[DNS]-G-x(4)-[FY]-x(2)-[NT]-x(2)-V-[LIV] 

[ 1] Otaka E., Hashimoto T., Mizuta K., Suzuki K. Protein Seq. Data Anal. 5:301- 
313(1993). 

499. Ribosomal protein L15 signature 

Ribosomal protein L15 is one of the proteins from the large ribosomal subunit.
In Escherichia coli, L15 is known to bind the 23S rRNA. It belongs to a family 
of ribosomal proteins which, on the basis of sequence similarities [1], 
groups: - Eubacterial L15. - Plant chloroplast L15 (nuclear-encoded). 

- Archaebacterial L15. - Vertebrate L27a. - Tetrahymena thermophila L29. 

- Fungi L27a (L29, CRP-1, CYH2). 

L15 is a protein of 144 to 154 amino-acid residues. As a signature pattern, 
a conserved region was selected in the C-terminal section of these proteins. 



-Consensus pattern: K-[LIVM](2)-[GASL]-x-[GT]-x-[LIVMA]-x(2,5)-[LIVM]-x-[LIVMF]-
x(3,4)-[LIVMFCA]-[ST]-x(2)-A-x(3)-[LIVM]-x(3)-G

[ 1] Otaka E., Hashimoto T., Mizuta K., Suzuki K. Protein Seq. Data Anal. 5:301-313(1993).

500. Ribosomal protein L15e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities [1]. One of these families consists of: 

- Mammalian L15. - Insect L15. - Plant L15. - Yeast YL10 (L13) (Rp15r).

- Thermoplasma acidophilum L15. 

These proteins have about 200 amino acid residues. As a signature pattern, 
a conserved region was selected located in the central section. 

-Consensus pattern: [DE]-[KR]-A-R-x-L-G-[FY]-x-[SAP]-x(2)-G-[LIVMFY](4)-R-x- 
[IV]-x-R-G 

[ 1] Zwickl P., Lupas A., Baumeister W. 
Biochem. Biophys. Res. Commun. 209:684-688(1995). 

501. Ribosomal protein L17 signature 

Ribosomal protein L17 is one of the proteins from the large ribosomal subunit.
L17 belongs to a family of ribosomal proteins which, on the basis of sequence
similarities, groups: - Eubacterial L17.

- Yeast mitochondrial YmL8 (gene MRPL8).

Eubacterial L17 is a protein of 120 to 130 amino-acid residues. Yeast YmL8 is
twice as large (238 residues); the sequence of its N-terminal half is colinear
with that of eubacterial L17. As a signature pattern, a conserved region in
the N-terminal section was selected.

-Consensus pattern: I-x-[ST]-[GT]-x(2)-[KR]-x-K-x(6)-[DE]-x-[LIMV]-[LIVMT]-T- 
x-[STAG]-[KR] 

502. Ribosomal protein L18e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped
on the basis of sequence similarities. One of these families consists of:

- Vertebrate L18 (known as L14 in Xenopus) [1]. - Plant L18.

- Yeast L18 (Rp28). - Halobacterium marismortui HL29.

- Sulfolobus acidocaldarius HL29e.

These proteins have 115 to 187 amino-acid residues. A stretch of about 13 residues in the
first third of these proteins has been selected as a signature pattern.

-Consensus pattern: [KRE]-x-L-x(2)-[PS]-[KR]-x(2)-[RH]-[PSA]-x-[LIVM]-[NS]- 
[LIVM]-x-[RK]-[LIVM] 

[ 1] Puder M., Barnard G.F., Staniunas R.J., Steele G.D. Jr., Chen L.B. 
Biochim. Biophys. Acta 1216:134-136(1993). 

503. Ribosomal L18p family

It has been shown that the amino-terminal 93 amino acids
of Swiss:P09895 are necessary and sufficient to bind 5S
rRNA in vitro. The carboxyl-terminal half of the protein,
comprising amino acids 151-296, serves to localize the
protein to the nucleolus [1].
Number of members: 26

[1] Medline: 96212235
Distinct domains in ribosomal protein L5 mediate 5S rRNA
binding and nucleolar localization.
Michael WM, Dreyfuss G;
J Biol Chem 1996;271:11571-11574.

504. Ribosomal protein L19 signature

Ribosomal protein L19 is one of the proteins from the large ribosomal subunit.
In Escherichia coli, L19 is known to be located at the 30S-50S ribosomal
subunit interface and may play a role in the structure and function of the
aminoacyl-tRNA binding site. It belongs to a family of ribosomal proteins
which, on the basis of sequence similarities, groups: - Eubacterial L19.
- Red algal chloroplast L19. - Cyanelle L19.

L19 is a protein of 120 to 130 amino-acid residues.
A conserved region in the C-terminal section has been selected as a signature pattern.

-Consensus pattern: [LIVM]-x-[KRGTI]-x-[GSAI]-[KRQDA]-[VG]-[RSN]-x(0,1)-[KR]-
[SA]-[KY]-[KLI]-[LYS]-Y-[LIM]-R

505. Ribosomal protein L19e signature

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped
on the basis of sequence similarities. One of these families consists of:
- Mammalian ribosomal protein L19 [1]. - Drosophila ribosomal protein L19 [2].

- Slime mold (D. discoideum) vegetative specific protein V14 [3].

- Yeast ribosomal protein L19 (YL14). - Archaebacterial ribosomal protein L19E.
These proteins have 148 to 203 amino-acid residues.

A stretch of about 20 residues in the N-terminal part of these
proteins has been selected as a signature pattern.

-Consensus pattern: Q-[KR]-R-[LIVM]-x-[SA]-x(4)-[CV]-G-x(3)-[IV]-[WK]-[LIVF]- 
[DN]-P 

[ 1] Chan Y.-L., Lin A., McNally J., Peleg D., Meyuhas O., Wool I.G.
J. Biol. Chem. 262:1111-1115(1987).
[ 2] Hart K., Klein T., Wilcox M.
Mech. Dev. 43:101-110(1993).
[ 3] Singleton C.K., Manning S.S., Ken R.
Nucleic Acids Res. 17:9679-9692(1989).



506. Ribosomal protein L1e signature (Ribosomal_L4)

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped
on the basis of sequence similarities. One of these families consists [1,2,3,
4] of: - Vertebrate L1 (L4). - Drosophila L1. - Plant L1. - Yeast L2 (Rp2).

- Fission yeast L2. - Halobacterium marismortui HmaL4 (HL6).

- Methanococcus jannaschii MJ0177.

These proteins have 246 (archaebacteria) to 427 (human) amino acids. A conserved region
in the N-terminal part of these proteins has been selected as a signature pattern.

-Consensus pattern: N-x(3)-[KRM]-x(2)-A-[LIVT]-x-S-A-[LIV]-x-A-[ST]-[SGA]-
x(7)-[RK]-[GS]-H

[ 1] Rafti F., Gargiulo G., Manzi A., Malva C., Graziani F.
Nucleic Acids Res. 17:456-456(1989).
[ 2] Presutti C., Villa T., Bozzoni I.
Nucleic Acids Res. 21:3900-3900(1993).
[ 3] Bagni C., Mariottini P., Annesi P., Amaldi F.
Biochim. Biophys. Acta 1216:475-478(1993).
[ 4] Arndt E., Kroemer W., Hatakeyama T. J. Biol. Chem. 265:3034-3039(1990).

507. Ribosomal protein L2 signature

Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. 
In Escherichia coli, L2 is known to bind to the 23S rRNA and to have 
peptidyltransferase activity. It belongs to a family of ribosomal proteins 
which, on the basis of sequence similarities [1,2], groups: - Eubacterial L2. 

- Algal and plant chloroplast L2. - Cyanelle L2. - Archaebacterial L2. 

- Plant L2. - Slime mold L2. - Marchantia polymorpha mitochondrial L2. 

- Paramecium tetraurelia mitochondrial L2. - Fission yeast K5, K37 and KD4. 

- Yeast YL6. - Vertebrate L8. 

The best conserved region located in the C-terminal section of these proteins has been
selected as a signature pattern.

-Consensus pattern: P-x(2)-R-G-[STAIV](2)-x-N-[APK]-x-[DE]

[ 1] Marty I., Meyer Y.
Nucleic Acids Res. 20:1517-1522(1992).
[ 2] Otaka E., Hashimoto T., Mizuta K., Suzuki K.
Protein Seq. Data Anal. 5:301-313(1993).

508. Ribosomal protein L20 signature 

Ribosomal protein L20 is one of the proteins from the large ribosomal subunit. 
In Escherichia coli, L20 is known to bind directly to the 23S rRNA. It belongs 
to a family of ribosomal proteins which, on the basis of sequence 
similarities [1], groups: - Eubacterial L20. - Algal and plant chloroplast L20. 
- Cyanelle L20.

L20 is a protein of about 120 amino-acid residues. A conserved region located in the
central section of these proteins has been selected as a signature pattern.
-Consensus pattern: K-x(3)-[KRC]-x-[LIVM]-W-[IV]-[STNALV]-R-[LIVM]-[NS]-x(3)- 
[RKHS] 

[ 1] Otaka E., Hashimoto T., Mizuta K., Suzuki K. 
Protein Seq. Data Anal. 5:301-313(1993). 

509. Ribosomal protein L21e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Mammalian L21 [1]. - Entamoeba histolytica L21 [2]. 

- Caenorhabditis elegans L21 (C14B9.7). - Yeast L21E (URP1) [3].

- Halobacterium marismortui HL31 [4]. 

These proteins have 160 (eukaryotes) or 95 (archaebacteria) amino-acid
residues. A conserved region in the central part of these proteins has been selected 
as a signature pattern. 

-Consensus pattern: G-[DE]-x-V-x(10)-[GV]-x(2)-[FYH]-x(2)-[FY]-x-G-x-T-G 
[ 1] Devi K.R.G., Chan Y.-L., Wool I.G. 

Biochem. Biophys. Res. Commun. 162:364-370(1989). 
[ 2] Petter R., Rozenblatt S., Nuchamowitz Y., Mirelman D.

Mol. Biochem. Parasitol. 56:329-333(1992). 
[ 3] Jank B., Waldherr M., Schweyen R.J. Curr. Genet. 23:15-18(1993). 
[ 4] Hatakeyama T., Kimura M. Eur. J. Biochem. 172:703-711(1988). 

510. Ribosomal protein L21 signature 

Ribosomal protein L21 is one of the proteins from the large ribosomal subunit. 
In Escherichia coli, L21 is known to bind to the 23S rRNA in the presence of 
L20. It belongs to a family of ribosomal proteins which, on the basis of 
sequence similarities, groups: - Eubacterial L21. 

- Marchantia polymorpha chloroplast L21. - Cyanelle L21. 

- Spinach chloroplast L21 (nuclear-encoded).

Eubacterial L21 is a protein of about 100 amino-acid residues; the mature form
of the spinach chloroplast L21 has 200 residues. A conserved region located in the
C-terminal section of these proteins has been selected as a signature pattern. 
-Consensus pattern: [IVT]-x(3)-[KR]-x(3)-[KRQ]-K-x(6)-G-[HF]-R-[RQ]-x(2)-[ST] 

511. Ribosomal protein L22 signature 

Ribosomal protein L22 is one of the proteins from the large ribosomal subunit. 
In Escherichia coli, L22 is known to bind 23S rRNA. It belongs to a family of 
ribosomal proteins which, on the basis of sequence similarities [1,2,3], 
groups: - Eubacterial L22. 

- Algal and plant chloroplast L22 (in legumes L22 is encoded in the nucleus 
instead of the chloroplast). - Cyanelle L22. - Archaebacterial L22. 

- Mammalian L17. - Plant L17. - Yeast YL17. 

A conserved region located in the C-terminal section of these proteins has
been selected as a signature pattern. 

-Consensus pattern: [RKQN]-x(4)-[RH]-[GAS]-x-G-[KRQS]-x(9)-[HDN]-[LIVM]-x-
[LIVMS]-x-[LIVM]
[ 1] Gantt J.S., Baldauf S.L., Calie P.J., Weeden N.F., Palmer J.D.
EMBO J. 10:3073-3078(1991).
[ 2] Madsen L.H., Kreiberg J.D., Gausing K.
Curr. Genet. 19:417-422(1991).
[ 3] Otaka E., Hashimoto T., Mizuta K., Suzuki K. 

Protein Seq. Data Anal. 5:301-313(1993). 

512. Ribosomal protein L23 signature 

Ribosomal protein L23 is one of the proteins from the large ribosomal subunit. 
In Escherichia coli, L23 is known to bind a specific region on the 23S rRNA; 
in yeast, the corresponding protein binds to a homologous site on the 26S rRNA 
[1]. It belongs to a family of ribosomal proteins which, on the basis of
sequence similarities [2,3,4], groups: - Eubacterial L23. 

- Algal and plant chloroplast L23. - Archaebacterial L23. - Mammalian L23A. 

- Caenorhabditis elegans L23A (F55D10.2). - Fungi L25.

- Yeast mitochondrial YmL41 (gene MRPL41 or MRP20).

A small conserved region in the C-terminal section of these proteins, which is
probably involved in rRNA-binding, has been selected as a signature pattern [2].
-Consensus pattern: [RK](2)-[AM]-[IVFYT]-[IV]-[RKT]-L-[STANEQK]-x(7)-[LIVMFT]
[ 1] El Baradi T.T.A.L., Raue H.A., van de Regt C.H.F., Verbree B.C.,
Planta R.J. EMBO J. 4:2101-2107(1985).
[ 2] Raue H.A., Otaka E., Suzuki K. J. Mol. Evol. 28:418-426(1989). 
[ 3] Fearon K., Mason T.L. J. Biol. Chem. 267:5162-5170(1992). 
[ 4] Otaka E., Hashimoto T., Mizuta K. 

Protein Seq. Data Anal. 5:285-300(1993). 

513. Ribosomal protein L24 signature 

Ribosomal protein L24 is one of the proteins from the large ribosomal subunit. 
L24 belongs to a family of ribosomal proteins which, on the basis of sequence
similarities, groups: - Eubacterial L24. 

- Plant chloroplast L24 (nuclear-encoded). - Red algal L24. - Vertebrate L26. 

- Yeast L26 (YL33). - Archaebacterial HmaL24 (HL15). 

- A probable ribosomal protein from Sulfolobus acidocaldarius [1]. 

In their mature form, these proteins have 103 to 150 amino-acid residues. 

A conserved stretch of 20 residues in their N-terminal section has been selected as a
signature pattern.

-Consensus pattern: [GDEN]-D-x-V-x-[IV]-[LIVMA]-x-G-x(2)-[KRA]-[GNQ]-x(2,3)-
[GA]-x-[IV]
[ 1] Ouzounis C., Kyrpides N., Sander C.
Nucleic Acids Res. 23:565-570(1995).

514. Ribosomal protein L24e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists [1] of: 

- Mammalian ribosomal protein L24.

- Yeast ribosomal protein L30A/B (Rp29) (YL21).

- Kluyveromyces lactis ribosomal protein L30.

- Arabidopsis thaliana ribosomal protein L24 homolog.

- Haloarcula marismortui ribosomal protein HL21/HL22.

- Methanococcus jannaschii MJ1201.

These proteins have 60 to 160 amino-acid residues. The most conserved region, which is
located in the N-terminal region of these proteins, has been selected as a signature pattern.
-Consensus pattern: [FY]-x-[GSH]-x(2)-[IV]-x-P-G-x-G-x(2)-[FYV]-x-[KRHE]-x-D
[ 1] Chan Y.-L., Olvera J., Wool I.G.
Biochem. Biophys. Res. Commun. 202:1176-1180(1994).

515. Ribosomal protein L27 signature 

Ribosomal protein L27 is one of the proteins from the large ribosomal subunit. 
L27 belongs to a family of ribosomal proteins which, on the basis of sequence
similarities [1,2], groups: - Eubacterial L27. 

- Plant chloroplast L27 (nuclear-encoded). - Algal chloroplast L27. 

- Yeast mitochondrial YmL2 (gene MRPL2 or MRP7). 

The schematic relationship between these groups of proteins is shown below. 
Eub. L27        Nxxxxxxxxx
Algal L27       Nxxxxxxxxx
Plant L27  tttttNxxxxxxxxxxxxx
Yeast MRP7   tttNxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
                 ***
't': transit peptide.
'N': N-terminal of mature protein.
'*': position of the pattern.

-Consensus pattern: G-x-[LIVM](2)-x-R-Q-R-G-x(5)-G

[ 1] Elhag G.A., Bourque D.P. Biochemistry 31:6856-6864(1992).
[ 2] Otaka E., Hashimoto T., Mizuta K. 

Protein Seq. Data Anal. 5:285-300(1993). 


516. Ribosomal L28 family 

The ribosomal L28 family includes L28 proteins from bacteria
and chloroplasts. The L24 protein from yeast Swiss:P36525
also contains a region of similarity to prokaryotic L28
proteins. L24 from yeast is also found in the large
ribosomal subunit.
Number of members: 24

517. Ribosomal protein L29 signature 

Ribosomal protein L29 is one of the proteins from the large ribosomal subunit. 
L29 belongs to a family of ribosomal proteins which, on the basis of sequence 
similarities [1], groups: - Eubacterial L29. - Red algal L29.

- Archaebacterial L29. - Mammalian L35. - Caenorhabditis elegans L35 (ZK652.4).

- Yeast L35. 

L29 is a protein of 63 to 138 amino-acid residues. 

A conserved region located in the central section of L29 has been selected as a
signature pattern.

-Consensus pattern: [KNQS]-[PSTL]-x(2)-[LIMFA]-[KRGSAN]-x-[LIVYSTA]-[KR]-
[KRHQS]-[DESTANRL]-[LIV]-A-[KRCQVT]-[LIVMA]
[ 1] Otaka E., Hashimoto T., Mizuta K.
Protein Seq. Data Anal. 5:285-300(1993).

518. Ribosomal protein L3 signature 

Ribosomal protein L3 is one of the proteins from the large ribosomal subunit. 
In Escherichia coli, L3 is known to bind to the 23S rRNA and may participate 
in the formation of the peptidyltransferase center of the ribosome. It belongs
to a family of ribosomal proteins which, on the basis of sequence 
similarities [1,2,3,4], groups: - Eubacterial L3. - Red algal L3. - Cyanelle L3. 

- Archaebacterial Halobacterium marismortui HmaL3 (HL1).

- Yeast L3 (also known as trichodermin resistance protein) (gene TCM1).

- Arabidopsis thaliana L3 (genes ARP1 and ARP2). - Mammalian L3 (L4).

- Mammalian mitochondrial L3. - Yeast mitochondrial YmL9 (gene MRPL9). 

A conserved region located in the central section of these proteins has been selected 
as a signature pattern.

-Consensus pattern: [FL]-x(6)-[DN]-x(2)-[AGS]-x-[ST]-x-G-[KRH]-G-x(2)-G-x(3)-R
[ 1] Arndt E., Kroemer W., Hatakeyama T. J. Biol. Chem. 265:3034-3039(1990). 
[ 2] Graack H.-R., Grohmann L., Kitakawa M., Schaefer K.L., Kruft V. 

Eur. J. Biochem. 206:373-380(1992). 
[ 3] Herwig S., Kruft V., Wittmann-Liebold B.

Eur. J. Biochem. 207:877-885(1992). 
[ 4] Otaka E., Hashimoto T., Mizuta K., Suzuki K. 

Protein Seq. Data Anal. 5:301-313(1993). 


519. Ribosomal protein L30 signature 

Ribosomal protein L30 is one of the proteins from the large ribosomal subunit. 
L30 belongs to a family of ribosomal proteins which, on the basis of sequence 
similarities [1], groups: - Eubacterial L30. - Archaebacterial L30. 
- Drosophila L7. - Slime mold L7. - Mammalian L7. - Fungi L7 (YL8).
- Yeast mitochondrial L33. 

L30 from eubacteria are small proteins of about 60 residues; those from
archaebacteria are proteins of about 150 residues. Eukaryotic L7 are proteins
of about 250 to 270 residues. The schematic relationship between the three
groups of proteins is shown below.

Eub. L30 NxxxxxxxxxxC
Arc. L30 NxxxxxxxxxxxxxxxxxxxxxxxxxxxC
Euk. L7 NxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxC
*******
'*': position of the pattern.

The signature pattern for this family of ribosomal proteins spans the
N-terminal half of the region common to all these proteins.

-Consensus pattern: [IVT]-[LIVM]-x(2)-[LF]-x-[LI]-x-[KRHQEG]-x(2)-[STNQH]-x-
[IVT]-x(10)-[LMS]-[LIV]-x(2)-[LIVA]-x(2)-[LMFY]-[IVT]
[ 1] Mizuta K., Hashimoto T., Otaka E.
Nucleic Acids Res. 20:1011-1016(1992).



520. Ribosomal protein L31 signature 

Ribosomal protein L31 is one of the proteins from the large ribosomal subunit.
L31 is a protein of 66 to 97 amino-acid residues which has only been found so
far in eubacteria and in some algal chloroplasts. 

A conserved region located in the central section of these proteins has been selected as 
a signature pattern. 

-Consensus pattern: H-P-F-[FY]-[TI]-x(9)-G-R-[AIV]-x-[KRQ]

521. Ribosomal protein L31e signature

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Mammalian L31 [1]. - Chlamydomonas reinhardtii L31. - Yeast L34. 

- Halobacterium marismortui HL30 [2]. 

These proteins have 87 to 128 amino-acid residues. 

A conserved region located in the central section has been selected as a signature pattern.
-Consensus pattern: V-[KR]-[LIVM]-x(3)-[LIVM]-N-x-[AKH]-x-W-x-[KR]-G
[ 1] Tanaka T., Kuwano Y., Kuzumaki T., Ishikawa K., Ogata K.
Eur. J. Biochem. 162:45-48(1987).
[ 2] Bergmann U., Arndt E.
Biochim. Biophys. Acta 1050:56-60(1990).

522. Ribosomal protein L33 signature 

Ribosomal protein L33 is one of the proteins from the large ribosomal subunit. 
In Escherichia coli, L33 has been shown to be on the surface of the 50S subunit.
L33 belongs to a family of ribosomal proteins which, on the basis of sequence 
similarities [1,2,3], groups: - Eubacterial L33. 
- Algal and plant chloroplast L33. - Cyanelle L33. 

L33 is a small protein of 49 to 66 amino-acid residues. A conserved region located
in the central section of L33 has been selected as a signature pattern.

-Consensus pattern: Y-x-[ST]-x-[KR]-[NS]-x(4)-[PATQ]-x(1,2)-[LIVM]-[EA]-x(2)-
K-[FY]-[CSD]

[ 1] Kruft V., Kapp U., Wittmann-Liebold B. Biochimie 73:855-860(1991).
[ 2] Sharp P.M. Gene 139:129-130(1994).
[ 3] Otaka E., Hashimoto T., Mizuta K.
Protein Seq. Data Anal. 5:285-300(1993).

523. Ribosomal protein L34 signature 

Ribosomal protein L34 is one of the proteins from the large subunit of the prokaryotic
ribosome. It is a small basic protein of 44 to 51 amino-acid residues [1]. L34 belongs to a
family of ribosomal proteins which, on the basis of sequence similarities, groups:

- Eubacterial L34.

- Red algal chloroplast L34. - Cyanelle L34. 

A conserved region that corresponds to the N-terminal half of L34 has been selected 
as a signature pattern. 

-Consensus pattern: K-[RG]-T-[FYWL]-[EQS]-x(5)-[KRHS]-x(4,5)-G-F-x(2)-R 
[ 1] Old I.G., Margarita D., Saint Girons I. 
Nucleic Acids Res. 20:6097-6097(1992). 

524. Ribosomal protein L34e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Mammalian L34. - Mosquito L31 [1]. - Plant L34 [2]. 

- Yeast putative ribosomal protein YIL052c. - Methanococcus jannaschii MJ0655. 
These proteins have 89 to 129 amino-acid residues. 

A conserved region located in the N-terminal section of these proteins has been 
selected as a signature pattern. 

-Consensus pattern: Y-x-[ST]-x-S-[NY]-x(5)-[KR]-T-P-G 
[ 1] Lan Q., Niu L.L., Fallon A.M.

Biochim. Biophys. Acta 1218:460-462(1994). 
[ 2] Gao J., Kim S.R., Chung Y.Y., Lee J.M., An G. 

Plant Mol. Biol. 25:761-770(1994). 

525. Ribosomal protein L35Ae signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of:

- Vertebrate L35A. - Caenorhabditis elegans L35A (F10E7.7).

- Yeast L37A/L37B (Rp47). - Pyrococcus woesei L35A homolog [1]. 
These proteins have 87 to 110 amino-acid residues. 

A highly conserved stretch of 22 residues in the C-terminal part of 
these proteins has been selected as a signature pattern. 

-Consensus pattern: G-K-[LIVM]-x-R-x-H-G-x(2)-G-x-V-x-A-x-F-x(3)-[LI]-P 
[ 1] Ouzounis C., Kyrpides N., Sander C.
Nucleic Acids Res. 23:565-570(1995). 
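
Several entries quote the length of the conserved stretch alongside its pattern, and the two can be cross-checked by summing the repeat counts of the pattern elements. The sketch below is one hedged way to do that; `pattern_length` is a hypothetical helper name, and it assumes each hyphen-separated element covers one position unless followed by `(n)` or `(n,m)`.

```python
import re
from typing import Tuple

def pattern_length(pattern: str) -> Tuple[int, int]:
    """Return the (minimum, maximum) number of residues a consensus pattern spans."""
    shortest = longest = 0
    for element in pattern.strip().strip("-").split("-"):
        # A trailing "(n)" or "(n,m)" is a repeat count; otherwise the element is one residue.
        repeat = re.search(r"\((\d+)(?:,(\d+))?\)$", element.strip())
        if repeat:
            low = int(repeat.group(1))
            high = int(repeat.group(2) or repeat.group(1))
        else:
            low = high = 1
        shortest += low
        longest += high
    return shortest, longest

# Pattern from entry 525 (L35Ae), described as a stretch of 22 residues.
print(pattern_length("G-K-[LIVM]-x-R-x-H-G-x(2)-G-x-V-x-A-x-F-x(3)-[LI]-P"))
# Pattern from entry 523 (L34), which contains a variable-length gap x(4,5).
print(pattern_length("K-[RG]-T-[FYWL]-[EQS]-x(5)-[KRHS]-x(4,5)-G-F-x(2)-R"))
```

For the L35Ae pattern the minimum and maximum both come to 22, in agreement with the text of that entry.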

526. Ribosomal protein L36 signature 

Ribosomal protein L36 is the smallest protein from the large subunit of the prokaryotic 
ribosome. It belongs to a family of ribosomal proteins which, on the basis of sequence 
similarities [1], groups: - Eubacterial L36. - Algal and plant chloroplast L36. - Cyanelle
L36.

L36 is a small basic and cysteine-rich protein of 37 amino-acid residues. As a signature
pattern, a conserved region that corresponds to positions 11 to 36 in L36 and includes three
conserved cysteine residues has been developed.

Consensus pattern: C-x(2)-C-x(2)-[LIVM]-x-R-x(3)-[LIVMN]-x-[LIVM]-x-C-x(3,4)-[KR]-
H-x-Q-x-Q-

[ 1] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993).

527. Ribosomal protein L36e signature

A number of eukaryotic ribosomal proteins can be grouped on the basis of 
sequence similarities. One of these families consists of: - Mammalian L36 [1]. 

- Drosophila L36 (M(1)1B). - Caenorhabditis elegans L36 (F37C12.4).

- Candida albicans L39. - Yeast YL39. 

These proteins have 99 to 104 amino acids. 

A conserved region in the central part of these proteins has been selected as a signature 
pattern. 

-Consensus pattern: P-Y-E-[KR]-R-x-[LIVM]-[DE]-[LIVM](2)-[KR] 
[ 1] Chan Y.-L., Paz V., Olvera J., Wool I.G. 
Biochem. Biophys. Res. Commun. 192:849-853(1993).

528. Ribosomal protein L39e signature

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Mammalian L39 [1]. - Plants L39. - Yeast L46 [2]. - Archaebacterial L39e [3].

These proteins are very basic. About 50 residues long, they are the smallest
proteins of eukaryotic-type ribosomes. A conserved region in the C-terminal 
section of these proteins has been selected as a signature pattern. 

-Consensus pattern: [KRA]-T-x(3)-[LIVM]-[KRQF]-x-[NHS]-x(3)-R-[NHY]-W-R-R 

[ 1] Lin A., McNally J., Wool I.G. J. Biol. Chem. 259:487-490(1984). 

[ 2] Leer R.J., van Raamsdonk-Duin M.M.C., Kraakman P., Mager W.H., 
Planta R.J. Nucleic Acids Res. 13:701-709(1985). 

[ 3] Ramirez C., Louie K.A., Matheson A.T. FEBS Lett. 250:416-418(1989).



529. Ribosomal L40e family

Bovine L40 has been identified as a secondary RNA binding
protein [1]. L40 is fused to a ubiquitin protein [2].

Number of members: 27

[1] Medline: 88203200
RNA binding proteins of the large subunit of bovine
mitochondrial ribosomes.
Piatyszek MA, Denslow ND, O'Brien TW;
Nucleic Acids Res 1988;16:2565-2583.

[2] Medline: 96011832
The carboxyl extensions of two rat ubiquitin fusion proteins
are ribosomal proteins S27a and L40.
Chan YL, Suzuki K, Wool IG;
Biochem Biophys Res Commun 1995;215:682-690.

530. (Ribosomal L44) Ribosomal protein L44e signature

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Mammalian L44 [1]. - Trypanosoma brucei L44. 

- Caenorhabditis elegans L44 (C09H10.2). - Fungal L44 (L41). 

- Halobacterium marismortui LA [2]. 

These proteins have 92 to 105 amino-acid residues. 

A conserved region located in the C-terminal part of these proteins has been 
selected as a signature pattern. 

-Consensus pattern: K-x-[TV]-K-K-x(2)-L-[KR]-x(2)-C 
[ 1] Gallagher M.J., Chan Y.-L., Lin A., Wool I.G. DNA 7:269-273(1988). 
[ 2] Bergmann U., Wittmann-Liebold B. 
Biochim. Biophys. Acta 1173:195-200(1993).



531. Ribosomal protein L5 signature 

Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. 
In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large 
ribosomal subunit. It belongs to a family of ribosomal proteins which, on the 
basis of sequence similarities [1,2,3,4], groups: - Eubacterial L5. 

- Algal chloroplast L5. - Cyanelle L5. - Archaebacterial L5. - Mammalian L11.

- Tetrahymena thermophila L21. - Slime mold L5 (V18). - Yeast L16 (39A). 

- Plants mitochondrial L5. 

L5 is a protein of about 180 amino-acid residues. 

A conserved region located in the first third of these
proteins has been selected as a signature pattern.

-Consensus pattern: [LIVM]-x(2)-[LIVM]-[STAVC]-[GE]-[QV]-x(2)-[LIVMA]-x-[
x-[STAG]-[KRH]-x-[STA] 

[ 1] Hatakeyama T., Hatakeyama T. Biochim. Biophys. Acta 1039:343-347(1990). 
[ 2] Rosendahl G., Andreasen P.H., Kristiansen K. Gene 98:161-167(1991). 
[ 3] Yang D., Gunther I., Matheson A.T., Auer J., Spicker G., Boeck A. 

Biochimie 73:679-682(1991). 
[ 4] Otaka E., Hashimoto T., Mizuta K., Suzuki K.
Protein Seq. Data Anal. 5:301-313(1993).

532. ribosomal L5P family C-terminus 

This region is found associated with Ribosomal_L5. 
Number of members: 60 

533. Ribosomal protein L6 signatures 

Ribosomal protein L6 is one of the proteins from the large ribosomal subunit. In 
Escherichia coli, L6 is known to bind directly to the 23S rRNA and is located at the 
aminoacyl-tRNA binding site of the peptidyltransferase center. It belongs to a family of 
ribosomal proteins which, on the basis of sequence similarities [1,2,3,4], groups:

- Eubacterial L6.

- Algal chloroplast L6.

- Cyanelle L6.

- Archaebacterial L6.

- Marchantia polymorpha mitochondrial L6.

- Yeast mitochondrial YmL6 (gene MRPL6).

- Mammalian L9.

- Drosophila L9.

- Plants L9.

- Yeast L9 (YL11).

While all the above proteins are evolutionarily related, it is very difficult to derive a
pattern that will find them all. Two patterns were therefore created, the first to detect 
eubacterial, cyanelle and mitochondrial L6, the second to detect archaebacterial L6 as well as 
eukaryotic L9. 

-Consensus pattern: [PS]-[DENS]-x-Y-K-[GA]-K-G-[LIVM] 

-Consensus pattern: Q-x(3)-[LIVM]-x(2)-[KR]-x(2)-R-x-F-x-D-G-[LIVM]-Y-[LIVM]-x(2)- 
[KR] 

[1] Suzuki K., Olvera J., Wool I.G. Gene 93:297-300(1990). 

[2] Schwank S., Harrer R., Schueller H.-J., Schweizer E. Curr. Genet. 24:136-140(1993).

[3] Golden B.L., Ramakrishnan V., White S.W. EMBO J. 12:4901-4908(1993).

[ 4] Otaka E., Hashimoto T., Mizuta K., Suzuki K. Protein Seq. Data Anal. 5:301-313(1993). 
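
Because the L6 family is covered by two separate patterns, a scan has to try both and report which one fired. The following is a rough sketch of that idea. The helper names, the group labels, and the two test peptides are illustrative assumptions; the labels also assume the two consensus patterns are listed in the same order as the groups named in the text.

```python
import re
from typing import List

def to_regex(pattern: str) -> str:
    """Shorthand conversion: drop hyphens, map 'x' to '.', and '(n)' to '{n}'."""
    regex = pattern.replace("-", "").replace("x", ".")
    return re.sub(r"\((\d+(?:,\d+)?)\)", r"{\1}", regex)

# The two consensus patterns quoted in entry 533, each written on a single line.
L6_PATTERNS = {
    "eubacterial, cyanelle and mitochondrial L6":
        "[PS]-[DENS]-x-Y-K-[GA]-K-G-[LIVM]",
    "archaebacterial L6 and eukaryotic L9":
        "Q-x(3)-[LIVM]-x(2)-[KR]-x(2)-R-x-F-x-D-G-[LIVM]-Y-[LIVM]-x(2)-[KR]",
}

def matching_groups(sequence: str) -> List[str]:
    """Return the label of every L6 pattern found anywhere in the sequence."""
    return [label for label, pattern in L6_PATTERNS.items()
            if re.search(to_regex(pattern), sequence)]

# Made-up peptides built to satisfy one pattern each; they are not real L6 or L9 sequences.
print(matching_groups("MAAPDAYKGKGLTT"))
print(matching_groups("QAAALAAKAARAFADGIYVAAK"))
print(matching_groups("MKKLLNNGG"))
```

Returning a list rather than a single label keeps the case where neither pattern matches (an empty list) distinct from a hit.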

534. Ribosomal protein L6e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Mammalian ribosomal protein L6 (L6 was previously known as TAX-responsive 
enhancer element binding protein 107). 

- Caenorhabditis elegans ribosomal protein L6 (R151.3). 

- Yeast ribosomal protein YL16A/YL16B. 

- Mesembryanthemum crystallinum ribosomal protein YL16-like. 

These proteins have 175 (yeast) to 287 (mammalian) amino acids. A highly conserved
region in the central part of these proteins has been selected as a signature
pattern.

-Consensus pattern: N-x(2)-P-L-R-R-x(4)-[FY]-V-I-A-T-S-x-K 



535. Ribosomal protein L7Ae signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Vertebrate L7A (SURF3) [1]. - Plant L7A. - Yeast L7A (YL5) (Rp6). 

- Yeast protein NHP2 [2]. - Yeast hypothetical protein YEL026w. 

- Bacillus subtilis hypothetical protein ylxQ. - Halobacterium marismortui Hs6. 

- Methanococcus jannaschii MJ1203. 

These proteins have 100 to 265 amino-acid residues. 

A conserved region located in the central section has been selected as a signature pattern. 
-Consensus pattern: [CA]-x(4)-[IV]-P-[FY]-x(2)-[LIVM]-x-[GSQ]-[KRQ]-x(2)-L-G 
[ 1] Colombo P., Yon J., Garson K., Fried M. 

Proc. Natl. Acad. Sci. U.S.A. 89:6358-6362(1992). 
[ 2] Kolodrubetz D., Burgum A. Yeast 7:79-90(1991).

536. Ribosomal protein L9 signature

Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. 
In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs 
to a family of ribosomal proteins which, on the basis of sequence similarities 
[1,2], groups: - Eubacterial L9. - Cyanobacterial L9. 
- Plant chloroplast L9 (nuclear-encoded). - Red algal chloroplast L9. 
A conserved region located in the N-terminal section of these proteins has been selected
as a signature pattern. 

-Consensus pattern: G-x(2)-[GN]-x(4)-V-x(2)-G-[FY]-x(2)-N-[FY]-L-x(5)-[GA]- 
x(3)-[STN] 

[ 1] Hoffman D.W., Davies C., Gerchman S.E., Kycia J.H., Porter S.J., 

White S.W., Ramakrishnan V. EMBO J. 13:205-212(1994). 
[ 2] Otaka E., Hashimoto T., Mizuta K., Suzuki K. 

Protein Seq. Data Anal. 5:301-313(1993). 

537. Ribosomal protein S10 signature

Ribosomal protein S10 is one of the proteins from the small ribosomal subunit.
In Escherichia coli, S10 is known to be involved in binding tRNA to the
ribosomes. It belongs to a family of ribosomal proteins which, on the basis
of sequence similarities [1], groups: - Eubacterial S10.

- Algal chloroplast S10. - Cyanelle S10. - Archaebacterial S10.

- Marchantia polymorpha and Prototheca wickerhamii mitochondrial S10.

- Arabidopsis thaliana mitochondrial S10 (nuclear encoded). - Vertebrate S20.

- Plant S20. - Yeast URP2.

S10 is a protein of about 100 amino-acid residues.

A conserved region located in the center of these proteins has been selected as a signature
pattern.

-Consensus pattern: [AV]-x(3)-[GDNSR]-[LIVMSTA]-x(3)-G-P-[LIVM]-x-[LIVM]-P- 
[ 1] Otaka E., Hashimoto T., Mizuta K. 
Protein Seq. Data Anal. 5:285-300(1993).

538. Ribosomal protein S11 signature

Ribosomal protein S11 [1] plays an essential role in selecting the correct
tRNA in protein biosynthesis. It is located on the large lobe of the small
ribosomal subunit. S11 belongs to a family of ribosomal proteins which, on the
basis of sequence similarities, groups [2]: - Eubacterial S11.

- Algal and plant chloroplast S11. - Cyanelle S11. - Archaebacterial S11.

- Marchantia polymorpha and Prototheca wickerhamii mitochondrial S11.

- Acanthamoeba castellanii mitochondrial S11. - Neurospora crassa S14 (crp-2).

- Yeast S14 (RP59 or CRY1).

- Mammalian, Drosophila, Trypanosoma, and plant S14.

- Caenorhabditis elegans S14 (F37C12.9).

One of the best conserved regions in these proteins was selected as a signature
pattern.

-Consensus pattern: [LIVMF]-x-[GSTAC]-[LIVMF]-x(2)-[GSTAL]-x(0,1)-[GSN]-
[LIVMF]-x-[LIVM]-x(4)-[DEN]-x-T-P-x-[PA]-[STCH]-[DN]

[ 1] Kimura M., Kimura J., Hatakeyama T. FEBS Lett. 240:15-20(1988).

[ 2] Otaka E., Hashimoto T., Mizuta K. 
Protein Seq. Data Anal. 5:285-300(1993). 

539. Ribosomal protein S12 signature 

Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S12 is known to be involved in the translation initiation 
step. It is a very basic protein of 120 to 150 amino-acid residues. S12 
belongs to a family of ribosomal proteins which, on the basis of sequence 
similarities [1], groups: - Eubacterial S12. - Archaebacterial S12.

- Algal and plant chloroplast S12. - Cyanelle S12.

- Protozoa and plant mitochondrial S12. - Yeast S28.

- Drosophila mitochondrial protein tko (Technical KnockOut). - Mammalian S23.

The best conserved regions in these proteins, located in the center of each
sequence, have been selected as a signature pattern.

-Consensus pattern: [RK]-x-P-N-S-[AR]-x-R 
[ 1] Otaka E., Hashimoto T., Mizuta K.
Protein Seq. Data Anal. 5:285-300(1993).

540. Ribosomal protein S12e signature 

A number of eukaryotic ribosomal proteins can be grouped on the basis of
sequence similarities. One of these families consists of: - Vertebrate S12 [1]. 

- Trypanosoma brucei S12 [2]. - Caenorhabditis elegans S12 (F54E7.2). 

- Drosophila S12. - Yeast S12.

These proteins have 130 to 150 amino acids. 

A conserved region in the N-terminal part of these proteins has been selected 
as a signature pattern. 

-Consensus pattern: A-L-[KRQP]-x-V-L-x(2)-[SA]-x(3)-[DN]-G-L
[ 1] Lin A., Chan Y.-L., Jones R., Wool I.G.
J. Biol. Chem. 262:14343-14351(1987).
[ 2] Marchal C., Ismaili N., Pays E.
Mol. Biochem. Parasitol. 57:331-334(1993).

541. Ribosomal protein S13 signature

Ribosomal protein S13 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S13 is known to be involved in binding fMet-tRNA and, 
hence, in the initiation of translation. It is a basic protein of 115 to 177 
amino-acid residues and belongs to a family of ribosomal proteins which, on 
the basis of sequence similarities [1,2], groups: - Eubacterial S13. 

- Plant chloroplast S13 (nuclear encoded). - Red algal chloroplast S13. 

- Cyanelle S13. - Archaebacterial S13. - Plant mitochondrial S13.

- Mammalian and plant S18.

The best conserved regions in these proteins, located in their C-terminal
part, have been selected as a signature pattern.

-Consensus pattern: [KRQS]-G-x-R-H-x(2)-[GSNH]-x(2)-[LIVMC]-R-G-Q 
[ 1] Chan Y.-L., Paz V., Wool I.G. 

Biochem. Biophys. Res. Commun. 178:1212-1218(1991). 
[ 2] Otaka E., Hashimoto T., Mizuta K. 

Protein Seq. Data Anal. 5:285-300(1993).

542. Ribosomal protein S14p/S29e (Ribosomal protein S14 signature)

Ribosomal protein S14 is one of the proteins from the small ribosomal subunit. In
Escherichia coli, S14 is known to be required for the assembly of 30S particles and may also
be responsible for determining the conformation of 16S rRNA at the A site. It belongs to a
family of ribosomal proteins which, on the basis of sequence similarities [1,2], groups:

- Eubacterial S14.

- Algal and plant chloroplast S14.

- Cyanelle S14.

- Archaebacterial Methanococcus vannielii S14.

- Plant mitochondrial S14.

- Yeast mitochondrial MRP2.

- Mammalian S29.

- Yeast YS29A/B.

S14 is a protein of 53 to 115 amino-acid residues. Our signature pattern is based on
the few conserved positions located in the center of these proteins.

Consensus pattern: [RP]-x(0,1)-C-x(11,12)-[LIVMF]-x-[LIVMF]-[SC]-[RG]-x(3)-[RN]

[1] Chan Y.-L., Suzuki K., Olvera J., Wool I.G. Nucleic Acids Res. 21:649-655(1993).
[2] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993).

543. Ribosomal protein S15 signature

Ribosomal protein S15 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, this protein binds to 16S ribosomal RNA and functions at
early steps in ribosome assembly. It belongs to a family of ribosomal proteins
which, on the basis of sequence similarities [1,2], groups: - Eubacterial S15.

- Archaebacterial Halobacterium marismortui HmaS15 (HS11).

- Plant chloroplast S15. - Yeast mitochondrial S28. - Mammalian S13.

- Brugia pahangi and Wuchereria bancrofti S13 (S15). - Yeast S13 (YS15).

S15 is a protein of 80 to 250 amino-acid residues.

A conserved region located in the C-terminal part of these proteins has been
selected as a signature pattern. 

-Consensus pattern: [LIVM]-x(2)-H-[LIVMFY]-x(5)-D-x(2)-[SAGN]-x(3)-[LF]-x(9)-
[LIVM]-x(2)-[FY]
[ 1] Dang H., Ellis S.R. 

Nucleic Acids Res. 18:6895-6901(1990). 
[ 2] Otaka E., Hashimoto T., Mizuta K. 
Protein Seq. Data Anal. 5:285-300(1993). 



544. Ribosomal protein S16 signature 

Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It 
belongs to a family of ribosomal proteins which, on the basis of sequence similarities [1], 
groups: 

- Eubacterial S16.

- Algal and plant chloroplast S16. 

- Cyanelle S16. 

- Neurospora crassa mitochondrial S24 (cyt-21). 

S16 is a protein of about 100 amino-acid residues. A conserved region located in the 
N-terminal extremity of these proteins has been selected as a signature pattern. 

Consensus pattern: [LIVMT]-x-[LIVM]-[KR]-L-[STAK]-R-x-G-[AKR] 

[1] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993). 

545. Ribosomal protein S17 signature 

Ribosomal protein S17 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S17 is known to bind specifically to the 5' end of 16S
ribosomal RNA and is thought to be involved in the recognition of termination 
codons. It belongs to a family of ribosomal proteins which, on the basis of 
sequence similarities [1,2,3], groups: - Eubacterial S17. 
- Plant chloroplast S17 (nuclear encoded). - Red algal chloroplast S17.

- Cyanelle S17. - Archaebacterial S17. - Mammalian and plant cytoplasmic S11.

- Yeast S18a and S18b (RP41; YS12). 

The best conserved regions located in the C-terminal sections of these proteins have 
been selected as a signature pattern. 

-Consensus pattern: G-D-x-[LIV]-x-[LIVA]-x-[QEK]-x-[RK]-P-[LIV]-S 
[ 1] Gantt J.S., Thompson M.D. J. Biol. Chem. 265:2763-2767(1990). 
[ 2] Herfurth E., Hirano H., Wittmann-Liebold B. 

Biol. Chem. Hoppe-Seyler 372:955-961(1991). 
[ 3] Otaka E., Hashimoto T., Mizuta K. 

Protein Seq. Data Anal. 5:285-300(1993). 

546. Ribosomal protein S17e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Vertebrates S17 [1]. - Drosophila S17 [2]. - Neurospora crassa S17 (crp-3). 

- Yeast S17a (RP51A) and S17b (RP51B) [3]. - Methanococcus jannaschii MJ0245. 

These proteins have from 63 (in archaebacteria) to 130 to 146 amino acids and
are highly conserved. A region in the central part of these proteins has been selected
as a signature. 

-Consensus pattern: A-x-I-x-[ST]-K-x-L-R-N-[KR]-I-A-G-[FY]-x-T-H 

[ 1] Chen I.-T., Roufa D.J. Gene 70:107-116(1988). 

[ 2] Maki C., Rhoads D.D., Stewart M.J., van Slyke B., Denell R.E.,
Roufa D.J. Gene 79:289-298(1989).
[ 3] Abovich N., Rosbash M.
Mol. Cell. Biol. 4:1871-1879(1984).

547. Ribosomal protein S18 signature 

Ribosomal protein S18 is one of the proteins from the small ribosomal subunit. In 
Escherichia coli, S18 has been involved in aminoacyl-tRNA binding [1]. It appears to be
situated at the tRNA A-site of the ribosome. It belongs to a family of ribosomal proteins
which, on the basis of sequence similarities [2], groups: - Eubacterial S18. - Algal and plant
chloroplast S18. - Cyanelle S18.

As a signature pattern, a conserved region in the central
section of the protein has been selected. This region contains two basic residues which may
be involved in RNA-binding.

Consensus pattern: [IV]-[DY]-Y-x(2)-[LIVMT]-x(2)-[LIVM]-x(2)-[FYT]-[LIVM]-[ST]-
[DERP]-x-[GY]-K-[LIVM]-x(3)-R-[LIVMAS]- 

[ 1] McDougall J., Choli T., Kruft V., Kapp U., Wittmann-Liebold B. FEBS Lett. 245:253-260(1989).
[ 2] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993).

548. Ribosomal protein S19 signature 

Ribosomal protein S19 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S19 is known to form a complex with S13 that binds 
strongly to 16S ribosomal RNA. S19 belongs to a family of ribosomal proteins 
which, on the basis of sequence similarities [1,2], groups: - Eubacterial S19. 

- Algal and plant chloroplast S19. - Cyanelle S19. - Archaebacterial S19. 

- Plant mitochondrial S19. - Eukaryotic S15 ('rig' protein). 

S19 is a protein of 88 to 144 amino-acid residues. Our signature pattern is 
based on the few conserved positions located in the C-terminal section of 
these proteins. 

-Consensus pattern: [STDNQ]-G-[KRQM]-x(6)-[LIVM]-x(4)-[LIVM]-[GSD]-x(2)-[LF]- 

[GAS]-[DE]-F-x(2)-[ST] 
[ 1] Kitagawa M., Takasawa S., Kikuchi N., Itoh T., Teraoka H., Yamamoto H., 

Okamoto H. FEBS Lett. 283:210-214(1991). 
[ 2] Otaka E., Hashimoto T., Mizuta K. 

Protein Seq. Data Anal. 5:285-300(1993). 

549. Ribosomal protein S19e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities [1,2]. One of these families consists 
of: - Mammalian S19. - Drosophila S19. 

- Ascaris lumbricoides S19g (ALEP-1) and S19s. - Yeast YS16 (RP55A and RP55B). 

- Aspergillus S16. - Halobacterium marismortui HS12. 
These proteins have 143 to 155 amino acids. 

A well conserved stretch of 20 residues in the C-terminal part of these proteins has
been selected as a signature pattern. 

-Consensus pattern: P-x(6)-[SAN]-x(2)-[LIVMA]-x-R-x-[ALIV]-[LV]-Q-x-L-[EQ] 
[ 1] Etter A., Aboutanos M., Tobler H., Mueller F. 

Proc. Natl. Acad. Sci. U.S.A. 88:1593-1596(1991). 
[ 2] Suzuki K., Olvera J., Wool I.G. Biochimie 72:299-302(1990). 

550. Ribosomal protein S2 signatures 

Ribosomal protein S2 is one of the proteins from the small ribosomal subunit. 
S2 belongs to a family of ribosomal proteins which, on the basis of sequence 
similarities [1,2], groups: - Eubacterial S2. - Algal and plant chloroplast S2.

- Cyanelle S2. - Archaebacterial S2.

- Higher eukaryotes P40 (previously thought to be a laminin receptor). 

- Yeast NAB1. - Plant mitochondrial S2. - Yeast mitochondrial MRP4.
S2 is a protein of 235 to 394 amino-acid residues.

Two conserved regions have been selected as signature patterns. One is 
located in the N-terminal section and the other in the central section. 

-Consensus pattern: [LIVMFA]-x(2)-[LIVMFYC](2)-x-[STAC]-[GSTANQEKR]-[STALV]- 
[HY]-[LIVMF]-G 

-Consensus pattern: P-x(2)-[LIVMF](2)-[LIVMS]-x-[GDN]-x(3)-[DENL]-x(3)-[LIVM]- 
x-E-x(4)-[GNQKRH]-[LIVM]-[AP] 
[ 1] Davis S.C., Tzagoloff A., Ellis S.R. 

J. Biol. Chem. 267:5508-5514(1992). 
[ 2] Tohgo A., Takasawa S., Munakata H., Yonekura H., Hayashi N., Okamoto H. 

FEBS Lett. 340:133-138(1994).

551. Ribosomal protein S21 signature

Ribosomal protein S21 is one of the proteins from the small ribosomal subunit. So far 
S21 has only been found in eubacteria. It is a protein of 55 to 70 amino-acid residues. A 
conserved region in the N-terminal section of the protein has been selected as a signature 
pattern. 
Consensus pattern: [DE]-x-A-[LIY]-[KR]-R-F-K-[KR]-x(3)-[KR]

552. Ribosomal protein S21e signature

A number of eukaryotic ribosomal proteins can be grouped on the basis of
sequence similarities. One of these families consists of: - Mammalian S21 [1]. 

- Caenorhabditis elegans S21 (F37C12.11). - Rice S21 [2].

- Yeast S21 (Ys25) [3]. - Fission yeast S28 [4].
These proteins have 82 to 87 amino acids. 

A perfectly conserved nonapeptide in the N-terminal part of these proteins has 
been selected as a signature pattern. 
-Consensus pattern: L-Y-V-P-R-K-C-S-[SA] 

[ 1] Bhat K.S., Morrison S.G. Nucleic Acids Res. 21:2939-2939(1993). 
[ 2] Nishi R., Hashimoto H., Uchimiya H., Kato A.
Biochim. Biophys. Acta 1216:113-114(1993).
[ 3] Suzuki K., Otaka E.
Nucleic Acids Res. 16:6223-6223(1988).
[ 4] Itoh T., Otaka E., Matsui K.A.
Biochemistry 24:7418-7423(1985).

553. Ribosomal protein S24e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Vertebrate S24 [1]. - Yeast Rp50. - Mucor racemosus S24 [2]. 

- Halobacterium marismortui HS15 [3]. - Methanococcus jannaschii MJ0394. 
These proteins have 101 to 148 amino acids. 

A well conserved stretch in the central part of these proteins has been selected as 
a signature pattern. 

-Consensus pattern: [FYA]-G-x(2)-[KR]-[STA]-x-G-[FY]-[GA]-x-[LIVM]-Y-[DN]-
[SDN] 

[ 1] Brown S.J., Jewell A., Maki C.G., Roufa D.J. Gene 91:293-296(1990).
[ 2] Sosa L., Fonzi W.A., Sypherd P.S.
Nucleic Acids Res. 17:9319-9331(1989).
[ 3] Kimura J., Arndt E., Kimura M.
FEBS Lett. 224:65-70(1987).

554. Ribosomal protein S26e signature

A number of eukaryotic ribosomal proteins can be grouped on the basis of 
sequence similarities. One of these families consists of: - Mammalian S26 [1]. 

- Octopus S26 [2]. - Drosophila S26 (DS31) [3]. - Plant cytoplasmic S26. 
- Fungi S26 [4].

These proteins have 114 to 127 amino acids. 

A conserved octapeptide in the central part of these proteins has been selected as 
a signature pattern. 

-Consensus pattern: [YH]-C-V-S-C-A-I-H 

[ 1] Kuwano Y., Nakanishi O., Nabeshima Y., Tanaka T., Ogata K.
J. Biochem. 97:983-992(1985).
[ 2] Zinov'eva R.D., Tomarev S.I.
Dokl. Akad. Nauk SSSR 304:464-469(1989).
[ 3] Itoh N., Ohta K., Ohta M., Kawasaki T., Yamashina I.
Nucleic Acids Res. 17:2121-2121(1989).
[ 4] Wu M., Tan H.
Gene 150:401-402(1994).

555. Ribosomal protein S28e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Mammalian S28 [1]. - Plant S28 [2]. - Fungi S33 [3]. 

- Methanococcus jannaschii MJ1202. 

These proteins have from 64 to 78 amino acids. 

A highly conserved nonapeptide from the C-terminal extremity of these 

proteins has been selected as a signature pattern. 

-Consensus pattern: E-[ST]-E-R-E-A-R-x-L 

[ 1] Chan Y.-L., Olvera J., Wool I.G. 

Biochem. Biophys. Res. Commun. 179:314-318(1991). 
[ 2] Hwang I., Goodman H.M. Plant Physiol. 102:1357-1358(1993). 
[ 3] Hoekstra R., Ferreira P.M., Bootsman T.C., Mager W.H., Planta R.J. 

Yeast 8:949-959(1992). 

556. Ribosomal protein S3Ae signature

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Mammalian S3A (was originally known as v-fos transformation effector 
protein). - Caenorhabditis elegans S3A (F56F3.5). 

- Plant cytoplasmic S3A (CYC07) [1]. - Yeast Rp10 (PLC1 and PLC2).

- Fission yeast Rp10 (SpAC13G6.02c). - Methanococcus jannaschii MJ0980.
These proteins have from 220 to 250 amino acids. 

A conserved stretch in their N-terminal section was selected as a signature pattern. 
-Consensus pattern: [LIV]-x-[GH]-R-[IV]-x-E-x-[SC]-L-x-D-L 
[ 1] Liu J.H., Reid D.M. 
Plant Physiol. 109:338-338(1995). 

557. Ribosomal protein S3 signature 

Ribosomal protein S3 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S3 is known to be involved in the binding of initiator 
Met-tRNA. It belongs to a family of ribosomal proteins which, on the basis of 
sequence similarities [1], groups: - Eubacterial S3. 

- Algal and plant chloroplast S3. - Cyanelle S3. - Archaebacterial S3. 

- Plant mitochondrial S3. - Vertebrate S3. - Insect S3. 

- Caenorhabditis elegans S3 (C23G10.3). - Yeast S3 (Rp13).
S3 is a protein of 209 to 559 amino-acid residues. 

A conserved region located in the C-terminal section has been selected as a signature pattern. 
-Consensus pattern: [GSTA]-[KR]-x(6)-G-x-[LIVMT]-x(2)-[NQSCH]-x(1,3)-[LIVFCA]-

x(3)-[LIV]-[DENQ]-x(7)-[LMT]-x(2)-G-x(2)-G 
[ 1] Otaka E., Hashimoto T., Mizuta K. 

Protein Seq. Data Anal. 5:285-300(1993). 



558. Ribosomal protein S4 signature 




Ribosomal protein S4 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S4 is known to bind directly to 16S ribosomal RNA. 
Mutations in S4 have been shown to increase translational error frequencies. 
It belongs to a family of ribosomal proteins which, on the basis of sequence 
similarities [1,2], groups: - Eubacterial S4. - Algal and plant chloroplast S4. 

- Cyanelle S4. - Archaebacterial S4. - Mammalian S9. - Yeast YS11 (SUP45).

- Marchantia polymorpha mitochondrial S4. - Dictyostelium discoideum rp1024.

- Yeast protein NAM9 [3]. NAM9 has been characterized as a suppressor for 
ochre mutations in mitochondrial DNA. It could be a ribosomal protein that 
acts as a suppressor by decreasing translation accuracy. 

S4 is a protein of 171 to 205 amino-acid residues (except for NAM9 which is 
much larger). The signature pattern for this protein is based on a conserved 
region located in the central section of these proteins. 

-Consensus pattern: [LIVM]-[DE]-x-R-[LI]-x(3)-[LIVMC]-[VMFYHQ]-[KRT]-x(3)- 

[STAGCVF]-x-[ST]-x(3)-[SAI]-[KR]-x-[LIVMF](2) 
[ 1] Mizuta K., Hashimoto T., Suzuki K.I., Otaka E. 

Nucleic Acids Res. 19:2603-2608(1991). 
[ 2] Otaka E., Hashimoto T., Mizuta K. 

Protein Seq. Data Anal. 5:285-300(1993). 
[ 3] Boguta M., Dmochowska A., Borsuk P., Wrobel K., Gargouri A., Lazowska J., 

Slonimski P., Szczesniak B., Kruszewska A. 

Mol. Cell. Biol. 12:402-412(1992). 

559. Ribosomal protein S4e signature 

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities. One of these families consists of: 

- Mammalian S4 [1]. Two highly similar isoforms of this protein exist : one 
coded by a gene on chromosome Y, and the other on chromosome X. 

- Plant cytoplasmic S4 [2] - Yeast S7 (YS6). - Archebacterial S4e. 
These proteins have 233 to 264 amino acids. 

A highly conserved stretch of 15 residues in their N-terminal section has 
been selected as a signature pattern. Four positions in this region are positively 
charged residues.

-Consensus pattern: H-x-K-R-[LIVMF]-[SANK]-x-P-x(2)-[WY]-x-[LIVM]-x-[KRP] 
[ 1] Fisher E.M., Beer-Romero P., Brown L.G., Ridley A., McNeil J.A., 

Lawrence J.B., Willard H.F., Bieber F.R., Page D.C. 

Cell 63:1205-1218(1990). 
[ 2] Braun H.P., Emmermann M., Mentzel H., Schmitz U.K. 

Biochim. Biophys. Acta 1218:435-438(1994). 

560. Ribosomal protein S5 signature 

Ribosomal protein S5 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S5 is known to be important in the assembly and function 
of the 30S ribosomal subunit. Mutations in S5 have been shown to increase 
translational error frequencies. It belongs to a family of ribosomal proteins 
which, on the basis of sequence similarities [1,2], groups: - Eubacterial S5. 

- Cyanelle S5. - Red algal chloroplast S5. - Archaebacterial S5.

- Mammalian S2 (LLrep3). - Caenorhabditis elegans S2 (C49H3.11). 

- Drosophila S2. - Plant S2. - Yeast S4 (SUP44). - Fungi mitochondrial S5. 
S5 is a protein of 166 to 254 amino-acid residues. The signature pattern for 
this protein is based on a conserved region, rich in glycine residues, and 
located in the N-terminal section of these proteins. 

-Consensus pattern: G-[KRQ]-x(3)-[FY]-x-[ACV]-x(2)-[LIVMA]-[LIVM]-[AG]-[DN]- 
x(2)-G-x-[LIVM]-G-x-[SAG]-x(5,6)-[DEQ]-[LIVMA]-x(2)-A- 
[LIVMF] 

[ 1] All-Robyn J.A., Brown N., Otaka E., Liebman S.W.
Mol. Cell. Biol. 10:6544-6553(1990).
[ 2] Otaka E., Hashimoto T., Mizuta K.
Protein Seq. Data Anal. 5:285-300(1993).

561. Ribosomal protein S6 signature 

Ribosomal protein S6 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S6 is known to bind together with S18 to 16S ribosomal 
RNA. It belongs to a family of ribosomal proteins which, on the basis of 
sequence similarities, groups: - Eubacterial S6. - Red algal chloroplast S6.

- Cyanelle S6. 

S6 is a protein of 95 to 208 amino-acid residues. The signature pattern for 
this protein is based on a conserved region located in the N-terminal section 
of these proteins.

-Consensus pattern: G-x-[KRC]-[DENQRH]-L-[SA]-Y-x-I-[KRNSA] 

562. Ribosomal protein S6e signature 
A number of eukaryotic and archaebacterial ribosomal proteins can be grouped
on the basis of sequence similarities. One of these families consists of: 

- Mammalian S6 [1]. - Drosophila S6 [2]. - Plant S6 [3]. - Yeast S10 (YS4).

- Halobacterium marismortui HS13 [4]. - Methanococcus jannaschii MJ1260. 
S6 is the major substrate of protein kinases in eukaryotic ribosomes [5]; it
may have an important role in controlling cell growth and proliferation
through the selective translation of particular classes of mRNA.

These proteins have 135 to 249 amino acids. 

A conserved stretch of 12 residues in the N-terminal part of these
proteins has been selected as a signature pattern.

-Consensus pattern: [LIVM]-[STAMR]-G-G-x-D-x(2)-G-x-P-M

[ 1] Franco R., Rosenfeld M.G. J. Biol. Chem. 265:4321-4325(1990). 

[ 2] Watson K.L., Konrad K.D., Woods D.F., Bryant P.J. 
Proc. Natl. Acad. Sci. U.S.A. 89:11302-11306(1992). 

[ 3] Hansen G., Estruch J.J., Spena A. 
Nucleic Acids Res. 20:5230-5230(1992).

[ 4] Kimura M., Arndt E., Hatakeyama T., Hatakeyama T., Kimura J.
Can. J. Microbiol. 35:195-199(1989). 

[ 5] Bandi H.R., Ferrari S., Krieg J., Meyer H.E., Thomas G. 
J. Biol. Chem. 268:4530-4533(1993). 




563. Ribosomal protein S7 signature 

Ribosomal protein S7 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S7 is known to bind directly to part of the 3' end of 16S
ribosomal RNA. It belongs to a family of ribosomal proteins which, on the 
basis of sequence similarities [1,2,3], groups: - Eubacterial S7. 

- Algal and plant chloroplast S7. - Cyanelle S7. - Archaebacterial S7. 

- Plant mitochondrial S7. - Mammalian S5. - Plant S5. 

- Caenorhabditis elegans S5 (T05E11.1). 

The best conserved region located in the N-terminal section of these proteins has 
been selected as a signature pattern. 

-Consensus pattern: [DENSK]-x-[LIVMDET]-x(3)-[LIVMFTA](2)-x(6)-G-K-[KR]-x(5)- 

[LIVMF]-[LIVMFC]-x(2)-[STAC] 
[ 1] Klussmann S., Franke P., Bergmann U., Kostka S., Wittmann-Liebold B. 

Biol. Chem. Hoppe-Seyler 374:305-312(1993). 
[ 2] Otaka E., Hashimoto T., Mizuta K. 

Protein Seq. Data Anal. 5:285-300(1993). 
[ 3] Ignatovich O., Cooper M., Kulesza H.M., Beggs J.D. 

Nucleic Acids Res. 23:4616-4619(1995). 

564. Ribosomal protein S7e signature 

A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence 
similarities [1]. One of these families consists of: 

- Mammalian S7.

- Xenopus S8.

- Insect S7.

- Yeast probable ribosomal protein S7 (N2212). 

- Fission yeast probable ribosomal protein S7 (SpAC18G6.13c). 

These proteins have about 200 amino acids. A highly conserved stretch of 14 residues which 
is located in the central section and which is rich in charged residues was selected as a 
signature pattern. 



Consensus pattern: [KR]-L-x-R-E-L-E-K-K-F-[SAP]-x-[KR]-H 
[1] Salazar C.E., Mills-Hamm D.M., Kumar V., Collins F.H. Nucleic Acids Res. 21:4147-4147(1993).



565. Ribosomal protein S8 signature

Ribosomal protein S8 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S8 is known to bind directly to 16S ribosomal RNA. It 
belongs to a family of ribosomal proteins which, on the basis of sequence 
similarities [1], groups: - Eubacterial S8. - Algal and plant chloroplast S8.
- Cyanelle S8. - Archaebacterial S8. - Marchantia polymorpha mitochondrial S8.
- Mammalian S15A. - Plant S15A. - Yeast S22 (S24).

The best conserved region located in the C-terminal section of these proteins 
has been selected as a signature pattern. 

-Consensus pattern: [GE]-x(2)-[LIV](2)-[STY]-[ST]-x(2)-G-[LIVM](2)-x(4)-[AG]- 
[KRHAYI]

[ 1] Otaka E., Hashimoto T., Mizuta K. 
Protein Seq. Data Anal. 5:285-300(1993). 



566. Ribosomal protein S8e signature

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped 
on the basis of sequence similarities [1]. One of these families consists of: 

- Mammalian S8. - Caenorhabditis elegans S8 (F42C5.8). - Leishmania major S8.

- Plant S8. - Yeast S8 (S14) (Rp19). - Archebacterial S8e.

These proteins have either about 220 amino acids (in eukaryotes) or about 125
amino acids (in archebacteria). A conserved stretch which is located in the
N-terminal section and which is rich in positively charged residues has
been selected as a signature pattern.

-Consensus pattern: [KR]-x(2)-[ST]-G-[GA]-x(5)-[HR]-[KG]-[KR]-x-K-x-E-[LM]-G 

[1] Engemann S., Herfurth E., Briesemeister U., Wittmann-Liebold B.
J. Protein Chem. 14:189-195(1995).

567. Ribosomal protein S9 signature

Ribosomal protein S9 is one of the proteins from the small ribosomal subunit. 
It belongs to a family of ribosomal proteins which, on the basis of sequence 
similarities [1,2], groups: - Eubacterial S9. - Algal chloroplast S9. 
- Cyanelle S9. - Archaebacterial S9. - Mammalian S16. - Plant S16.

- Yeast mitochondrial ribosomal S9. 

A conserved region containing many charged residues and located in the 
central section of these proteins has been selected as a signature pattern. 
-Consensus pattern: G-G-G-x(2)-[GSA]-Q-x(2)-[SA]-x(3)-[GSA]-x-[GSTAV]-[KR]-
[GSAL]-[LIF]

[ 1] Chan Y.-L., Paz V., Olvera J., Wool I.G. FEBS Lett. 263:85-88(1990).
[ 2] Otaka E., Hashimoto T., Mizuta K. 
Protein Seq. Data Anal. 5:285-300(1993). 


568. Ribulose-phosphate 3-epimerase family signatures 

Ribulose-phosphate 3-epimerase (EC 5.1.3.1) (also known as pentose-5-phosphate 
3-epimerase or PPE) is the enzyme that converts D-ribulose 5-phosphate into 
D-xylulose 5-phosphate in Calvin's reductive pentose phosphate cycle. In 
Alcaligenes eutrophus two copies of the gene coding for PPE are known [1],
one is chromosomally encoded (cbbEC), the other one is on a plasmid (cbbeP). 
PPE has been found in a wide range of bacteria, archebacteria, fungi and 
plants. The sequence of PPE is highly related to: 

- Escherichia coli D-allulose-6-phosphate 3-epimerase (gene alsE).

- Escherichia coli protein sgcE.

- Mycoplasma genitalium hypothetical protein MG112. 

All these proteins have from 209 to 241 amino acid residues. 

Two conserved regions which are located respectively in the N-terminal and in the
central part of these proteins have been selected as signature patterns.

-Consensus pattern: [LIVMF]-H-[LIVMFY]-D-[LIVM]-x-D-x(1,2)-[FY]-[LIVM]-x-N-x-

[STAV] 

-Consensus pattern: [LIVMA]-x-[LIVM]-M-[ST]-[VS]-x-P-x(3)-G-Q-x-F-x(6)-[NK]- 
[LIVMC] 
[ 1] Kusian B., Yoo J.G., Bednarski R., Bowien B.
J. Bacteriol. 174:7337-7344(1992). 

569. (Ricin B lectin) Similarity to lectin domain of ricin beta-chain, 3 copies.

This family consists of a triplicated domain involved in 
cell agglutination in ricin. 


570. (Rotamase) PpiC-type peptidyl-prolyl cis-trans isomerase signature 
Peptidyl-prolyl cis-trans isomerase (EC 5.2.1.8) (PPIase or rotamase) is an 
enzyme that accelerates protein folding by catalyzing the cis-trans 
isomerization of proline imidic peptide bonds in oligopeptides [1]. Most
characterized PPIases belong to two families, the cyclophilin-type (see
<PDOC00154>) and the FKBP-type (see <PDOC00426>). Recently a third family
has been discovered [2,3]. So far, the only biochemically characterized
member of this family is the Escherichia coli protein parvulin (gene ppiC), a
small (92 residues) cytoplasmic enzyme that prefers amino acid residues with
hydrophobic side chains like leucine and phenylalanine in the P1 position of
the peptide substrates. PpiC is evolutionarily related to a number of proteins
that are also probably PPIases:

- Escherichia coli and Haemophilus influenzae ppiD. PpiD is a PPIase which 
contains a periplasmic ppiC-like domain anchored to the inner membrane and
which seems to be involved in the folding of outer membrane proteins.

- Escherichia coli surA. SurA is a periplasmic protein that contains two 
ppiC-like domains. 

- Nitrogen-assimilating bacteria protein nifM which is involved in the 
activation and stabilization of the iron-component (nifH) of nitrogenase. 

- Bacillus subtilis protein prsA, a membrane-bound lipoprotein involved in
protein export. 

- Lactococcus and lactobacillus protease maturation protein prtM, a membrane-
bound lipoprotein involved in the maturation of a secreted serine
proteinase.

- Yeast protein ESS1/PTF1 (processing/termination factor 1).

- Drosophila protein dodo (gene dod). - Mammalian protein PIN1.

- Campylobacter jejuni cell binding factor 2 (CBF2), a secreted antigen. 

- Bacillus subtilis hypothetical protein yacD. 

- Helicobacter pylori hypothetical protein HP0175. 

- A hypothetical slime mold protein. 

A conserved region that contains a serine which could play a role in the catalytic 
mechanism of these enzymes has been selected as a signature pattern. 
-Consensus pattern: F-[GSADEI]-x-[LVAQ]-A-x(3)-[ST]-x(3,4)-[STQ]-x(3,5)-[GER]- 
G-x-[LIVM]-[GS] 
[ 1] Fischer G., Schmid F.X. 

Biochemistry 29:2205-2212(1990). 
[ 2] Rudd K.E., Sofia H.J., Koonin E.V., Plunkett G. III, Lazar S.,

Rouviere P.E. Trends Biochem. Sci. 20:14-15(1995). 
[ 3] Rahfeld J.-U., Ruecknagel K.P., Schelbert B., Ludwig B., Hacker J., 

Mann K., Fischer G. FEBS Lett. 352:180-184(1994).

571. (RrnaAD) Ribosomal RNA adenine dimethylases signature 
A number of enzymes responsible for the dimethylation of adenosines in
ribosomal RNAs (EC 2.1.1.48) have been found [1,2] to be evolutionarily related.
These enzymes are: 

- Bacterial 16S rRNA dimethylase (gene ksgA), which acts in the biogenesis 
of ribosomes by catalyzing the dimethylation of two adjacent adenosines in 
the loop of a conserved hairpin near the 3'-end of 16S rRNA. Inactivation 
of ksgA leads to resistance to the aminoglycoside antibiotic kasugamycin. 

- Yeast 18S rRNA dimethylase (gene DIM1), which is functionally similar to
ksgA and that dimethylates twin adenosines in the 3'-end of 18S rRNA. 

- Bacterial 'erm' methylases. These enzymes confer resistance to macrolide- 
lincosamide-streptogramin B (MLS) antibiotics - such as erythromycin - by 
dimethylating the adenine residue at position 2058 of 23S rRNA thus 
resulting in a reduced affinity between ribosomes and the MLS antibiotics. 

- Caenorhabditis elegans hypothetical protein E02H1.1. 

The best conserved region in these enzymes is located in the N-terminal
section and corresponds to a region that is probably involved in S-adenosyl 
methionine (SAM) binding. 

-Consensus pattern: [LIVM]-[LIVMFY]-[DE]-x-G-[STAPV]-G-x-[GA]-x-[LIVMF]-[ST]- 

x(2)-[LIVM]-x(6)-[LIVMY]-x-[STAGV]-[LIVMFYHC]-E-x-D 
[ 1] van Gemen B., van Knippenberg P.H. 

(In) Nucleic acid methylation, Clawson G.A., Willis D.B., Weissbach A., 

Jones P.A., Eds., pp.19-36, Alan R. Liss Inc, New-York, (1990).
[ 2] Lafontaine D., Delcour J., Glasser A.L., Desgres J., Vandenhaute J. 

J. Mol. Biol. 241:492-497(1994). 

572. (RuBisCO small) Ribulose bisphosphate carboxylase, small chain. 206 members 

573. ATP/GTP-binding site motif A (P-loop) (ras) 

From sequence comparisons and crystallographic data analysis it has been shown 
[1,2,3,4,5,6] that an appreciable proportion of proteins that bind ATP or GTP share a number 
of more or less conserved sequence motifs. The best conserved of these motifs is a glycine- 
rich region, which typically forms a flexible loop between a beta-strand and an alpha-helix. 
This loop interacts with one of the phosphate groups of the nucleotide. This sequence motif is 
generally referred to as the 'A' consensus sequence [1] or the 'P-loop' [5]. There are numerous 
ATP- or GTP-binding proteins in which the P-loop is found. A number of protein families for 
which the relevance of the presence of such a motif has been noted are listed below:

- ATP synthase alpha and beta subunits.
- Myosin heavy chains.
- Kinesin heavy chains and kinesin-like proteins.
- Dynamins and dynamin-like proteins.
- Guanylate kinase.
- Thymidine kinase.
- Thymidylate kinase.
- Shikimate kinase.
- Nitrogenase iron protein family (nifH/frxC).
- ATP-binding proteins involved in 'active transport' (ABC transporters) [7].
- DNA and RNA helicases [8,9,10].
- GTP-binding elongation factors (EF-Tu, EF-1alpha, EF-G, EF-2, etc.).
- Ras family of GTP-binding proteins (Ras, Rho, Rab, Ral, Ypt1, SEC4, etc.).
- Nuclear protein ran.
- ADP-ribosylation factors family.
- Bacterial dnaA protein.
- Bacterial recA protein.
- Bacterial recF protein.
- Guanine nucleotide-binding proteins alpha subunits (Gi, Gs, Gt, G0, etc.).
- DNA mismatch repair proteins mutS family.
- Bacterial type II secretion system protein E.

Not all ATP- or GTP-binding proteins are picked up by this motif. A number of
proteins escape detection because the structure of their ATP-binding site is completely
different from that of the P-loop. Examples of such proteins are the E1-E2 ATPases or the
glycolytic kinases. In other ATP- or GTP-binding proteins the flexible loop exists in a
slightly different form; this is the case for tubulins or protein kinases. A special mention must
be reserved for adenylate kinase, in which there is a single deviation from the P-loop pattern:
in the last position Gly is found instead of Ser or Thr.
Consensus pattern: [AG]-x(4)-G-K-[ST] 

In addition to the proteins listed above, the 'A' motif is also found in a number of other
proteins. Most of these proteins probably bind a nucleotide, but others are definitely not
ATP- or GTP-binding (as for example chymotrypsin, or human ferritin light chain).
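
Because the P-loop pattern fixes only four of its eight positions, chance matches in unrelated proteins are to be expected, which is consistent with the remark above about chymotrypsin and ferritin light chain. The sketch below scans a sequence for the motif and gives a rough estimate of the chance-match rate. The function names, the toy sequence, and the assumption that all 20 amino acids occur with equal frequency are illustrative only and are not taken from the original text.

```python
import re

# Regular-expression form of the pattern [AG]-x(4)-G-K-[ST].
P_LOOP_REGEX = r"[AG].{4}GK[ST]"
P_LOOP_LENGTH = 8  # one residue per pattern position

def find_p_loops(sequence):
    """Return (1-based start, matched text) for every match, overlaps included."""
    # The lookahead lets the scan report matches that overlap one another.
    scanner = re.compile("(?=(" + P_LOOP_REGEX + "))")
    return [(m.start() + 1, m.group(1)) for m in scanner.finditer(sequence)]

def chance_match_probability():
    """Probability that a random 8-residue window matches, assuming uniform residue usage."""
    # [AG] -> 2/20, G -> 1/20, K -> 1/20, [ST] -> 2/20; the x(4) positions accept anything.
    return (2 / 20) * (1 / 20) * (1 / 20) * (2 / 20)

def expected_chance_matches(protein_length):
    """Expected number of chance matches in a random protein of the given length."""
    windows = max(protein_length - P_LOOP_LENGTH + 1, 0)
    return windows * chance_match_probability()

# Hypothetical toy sequence, invented for this example only.
toy_sequence = "MTEYKLVVVGAGGVGKSALT"
hits = find_p_loops(toy_sequence)
```

Under the uniform-frequency assumption the per-window probability is small, but across a database of many thousands of sequences a noticeable number of non-nucleotide-binding proteins would still be expected to match, so a hit is best read as a hint rather than proof of ATP or GTP binding.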
[ 1] Walker J.E., Saraste M., Runswick M.J., Gay N.J. EMBO J. 1:945-951(1982).
[ 2] Moller W., Amons R. FEBS Lett. 186:1-7(1985).
[ 3] Fry D.C., Kuby S.A., Mildvan A.S. Proc. Natl. Acad. Sci. U.S.A. 83:907-911(1986).
[ 4] Dever T.E., Glynias M.J., Merrick W.C. Proc. Natl. Acad. Sci. U.S.A. 84:1814-1818(1987).
[ 5] Saraste M., Sibbald P.R., Wittinghofer A. Trends Biochem. Sci. 15:430-434(1990).
[ 6] Koonin E.V. J. Mol. Biol. 229:1165-1174(1993).
[ 7] Higgins C.F., Hyde S.C., Mimmack M.M., Gileadi U., Gill D.R., Gallagher M.P. J. Bioenerg. Biomembr. 22:571-592(1990).
[ 8] Hodgman T.C. Nature 333:22-23(1988) and Nature 333:578-578(1988) (Errata).
[ 9] Linder P., Lasko P., Ashburner M., Leroy P., Nielsen P.J., Nishi K., Schnier J., Slonimski P.P. Nature 337:121-122(1989).
[10] Gorbalenya A.E., Koonin E.V., Donchenko A.P., Blinov V.M. Nucleic Acids Res. 17:4713-4730(1989).

GTP-binding nuclear protein ran signature (ras) 

Ran (or TC4) is a small abundant nuclear protein that binds and hydrolyzes GTP and which 
has been implicated in a large number of processes including nucleocytoplasmic transport, 
RNA synthesis, processing and export and cell cycle checkpoint control [1,2]. Ran is 
generally included in the RAS 'superfamily' of small GTP-binding proteins [3], but it is only 
slightly related to the other RAS proteins. It also differs from RAS proteins in that it lacks 
cysteine residues at its C-terminus and is therefore not subject to prenylation. Instead ran has
an acidic C-terminus. It is, however, similar to RAS family members in requiring a specific
guanine nucleotide exchange factor (GEF) and a specific GTPase activating protein (GAP) as
stimulators of overall GTPase activity. The region of the GTP-binding B motif which, in ran, 
is perfectly conserved has been selected as a signature pattern. 
Consensus pattern: D-T-A-G-Q-E-K-[LF]-G-G-L-R-[DE]-G-Y-Y

Proteins belonging to this family also contain a copy of the ATP/GTP-binding motif 'A' (P-loop).

[ 1] Scheffzek K., Klebe C., Fritz-Wolf K., Kabsch W., Wittinghofer A. Nature 374:378-381(1995).
[ 2] Rush M.G., Drivas G., d'Eustachio P. BioEssays 18:103-112(1996).
[ 3] Valencia A., Chardin P., Wittinghofer A., Sander C. Biochemistry 30:4637-4648(1991).

574. recA signature 

The bacterial recA protein [1,2,3,E1] is essential for homologous recombination and 
recombinational repair of DNA damage. RecA has many activities: it filaments, it binds to 
single- and double-stranded DNA, it binds and hydrolyzes ATP, it is also a recombinase and,
finally, it interacts with lexA causing its activation and leading to its autocatalytic cleavage. 
RecA is a protein of about 350 amino-acid residues. Its sequence is very well conserved 
[3,4,5,E1] among eubacterial species. It is also found in the chloroplast of plants [6]. The best 
conserved region, a nonapeptide located in the middle of the sequence which is part of the 
monomer-monomer interface in a recA filament has been selected as a signature pattern.
Consensus pattern: A-L-[KR]-[IF]-[FY]-[STA]-[STAD]-[LIVMQ]-R
[ 1] Smith K.C., Wang T.-C. V. BioEssays 10:12-16(1989).
[ 2] Lloyd A.T., Sharp P.M. J. Mol. Evol. 37:399-407(1993).
[ 3] Roca A.I., Cox M.M. Prog. Nucleic Acids Res. Mol. Biol. 56:129-223(1997).
[ 4] Karlin S., Weinstock G.M., Brendel V. J. Bacteriol. 177:6881-6893(1995).
[ 5] Eisen J.A. J. Mol. Evol. 41:1105-1123(1995).
[ 6] Cerutti H.D., Osman M., Grandoni P., Jagendorf A.T. Proc. Natl. Acad. Sci. U.S.A. 89:8068-8072(1992).
[E1] http://www.tigr.org/~ieisen/RecA/RecA.html

575. Response regulator receiver domain 

This domain receives the signal from the sensor partner in bacterial two-component
systems. It is usually found N-terminal to a DNA binding effector domain.

[1] Pao GM, Saier MH; J Mol Evol 1995;40:136-154. 

576. Ribonucleotide reductase large subunit signature 

Ribonucleotide reductase (EC 1.17.4.1) [1,2] catalyzes the reductive synthesis of
deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors 
necessary for DNA synthesis. Ribonucleotide reductase is an oligomeric enzyme composed 
of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues). There are 
regions of similarities in the sequence of the large chain from prokaryotes, eukaryotes and 
viruses. One of these regions has been selected as a signature pattern. 
Consensus pattern: W-x(2)-[LF]-x(6,7)-G-[LIVM]-[FYRA]-[NH]-x(3)-[STAQLIVM]-[ASC]-x(2)-[PA]
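
The consensus patterns in these entries use a compact notation: a bare letter is a fixed residue, [..] lists the residues allowed at a position, {..} lists residues excluded at a position, x stands for any residue, and a number or range in parentheses is a repeat count. As a minimal sketch, assuming that reading of the notation, the pattern above can be translated into a regular expression and scanned against a sequence. The helper name prosite_to_regex and the toy sequence are illustrative assumptions and are not part of the specification.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex string."""
    parts = []
    for element in pattern.replace(" ", "").strip("-").split("-"):
        if not element:
            continue
        # Separate an optional repeat suffix such as (3) or (6,7) from the core.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            token = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"   # excluded residues
        else:
            token = core                      # fixed residue or [allowed set]
        if lo is not None:
            token += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(token)
    return "".join(parts)


# Pattern quoted from the entry above.
RIBORED_LARGE = (
    "W-x(2)-[LF]-x(6,7)-G-[LIVM]-[FYRA]-[NH]-x(3)-[STAQLIVM]-[ASC]-x(2)-[PA]"
)
regex = re.compile(prosite_to_regex(RIBORED_LARGE))

# Hypothetical toy sequence built to contain one match; not a real protein.
toy_sequence = "MK" + "WAALAAAAAAGLFNAAASAAAP" + "KK"
hit = regex.search(toy_sequence)
if hit:
    print(f"match at residues {hit.start() + 1}-{hit.end()}: {hit.group()}")
```

A regular expression reports only the leftmost non-overlapping matches, which is sufficient for a presence/absence check of a signature but is not a substitute for a dedicated pattern-scanning tool.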

[1] Nillson O., Lundqvist T., Hahne S., Sjoberg B.-M. Biochem. Soc. Trans. 16:91-94(1988).
[2] Reichard P. Science 260:1773-1777(1993).

577. Ribonuclease T2 family histidine active sites 

The fungal ribonucleases T2 from Aspergillus oryzae, M from Aspergillus saitoi and Rh from
Rhizopus niveus are structurally and functionally related 30 Kd glycoproteins [1] that cleave
the 3'-5' internucleotide linkage of RNA via a nucleotide 2',3'-cyclic phosphate intermediate
(EC 3.1.27.1). A number of other RNAses have been found to be evolutionarily related to these
fungal enzymes:
- Self-incompatibility [2] in flowering plants is often controlled by a single gene (S-gene)
  that has several alleles. This gene prevents fertilization by self-pollen or by pollen
  bearing either of the two S-alleles expressed in the style. The self-incompatibility
  glycoprotein from several higher plants of the solanaceae family has been shown [2,3] to be
  a ribonuclease.
- Phosphate-starvation induced RNAses LE and LX from tomato [4]. These two enzymes are
  probably involved in a phosphate-starvation rescue system.
- Escherichia coli periplasmic RNAse I (EC 3.1.27.6) (gene rna) [5].
- Aeromonas hydrophila periplasmic RNAse.
- Haemophilus influenzae hypothetical protein HI0526.
Two histidine residues have been shown [6,7] to be involved in the catalytic mechanism of
RNase T2 and Rh. These residues and the region around them are highly conserved in all the
sequences described above. Two signature patterns have been developed, one for each of the
two active-site histidines. The second pattern also contains a cysteine which is known to be
involved in a disulfide bond.

Consensus pattern: [FYWL]-x-[LIVM]-H-G-L-W-P [H is an active site residue] 
Consensus pattern: [LIVMF]-x(2)-[HDGTY]-[EQ]-[FYW]-x-[KR]-H-G-x-C [H is an active 
site residue] [C is involved in a disulfide bond] 
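
To illustrate how the bracketed annotations can be used, the sketch below hand-translates the two patterns above into regular expressions and wraps the annotated residues in capturing groups so that their positions can be reported. Requiring both signatures before reporting a hit is an assumption made for this example, as are the function name t2_active_sites and the toy sequence.

```python
import re

# Hand-translated regexes for the two consensus patterns of this entry.
# Capturing groups mark the annotated residues (active-site H, disulfide C).
T2_SITE_1 = re.compile(r"[FYWL].[LIVM](H)GLWP")
T2_SITE_2 = re.compile(r"[LIVMF].{2}[HDGTY][EQ][FYW].[KR](H)G.(C)")


def t2_active_sites(sequence):
    """Return 1-based positions of the annotated residues.

    Returns None unless both signatures are found (an assumption of this sketch).
    """
    first = T2_SITE_1.search(sequence)
    second = T2_SITE_2.search(sequence)
    if first is None or second is None:
        return None
    return {
        "his_1": first.start(1) + 1,
        "his_2": second.start(1) + 1,
        "cys": second.start(2) + 1,
    }


# Hypothetical toy sequence containing one copy of each signature.
toy = "AAA" + "FAIHGLWP" + "AAAA" + "LAAHEFAKHGAC" + "AA"
print(t2_active_sites(toy))
```

Note that the second pattern allows a histidine at its fourth position as well; the capturing group is placed on the later, fixed histidine because that is the residue annotated as the active site.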

[1] Watanabe H., Naitoh A., Suyama Y., Inokuchi N., Shimada H., Koyama T., Ohgi K., Irie
M. J. Biochem. 108:303-310(1990).
[2] Haring V., Gray J.E., McClure B.A., Anderson M.A., Clarke A.E. Science
250:937-941(1990).
[3] McClure B.A., Haring V., Ebert P.R., Anderson M.A., Simpson R.J., Sakiyama F.,
Clarke A.E. Nature 342:955-957(1989).
[4] Loeffler A., Glund K., Irie M. Eur. J. Biochem. 214:627-633(1993).
[5] Meador J. III, Kennell D. Gene 95:1-7(1990).
[6] Kawata Y., Sakiyama F., Hayashi F., Kyogoku Y. Eur. J. Biochem. 187:255-262(1990).
[7] Kurihara H., Mitsui Y., Ohgi K., Irie M., Mizuno H., Nakamura K.T. FEBS Lett.
306:189-192(1992).

578. Ribonucleotide reductase large subunit signature

Ribonucleotide reductase (EC 1.17.4.1) [1,2] catalyzes the reductive synthesis of
deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors
necessary for DNA synthesis. Ribonucleotide reductase is an oligomeric enzyme composed of
a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues). There are
regions of similarity in the sequence of the large chain from prokaryotes, eukaryotes and
viruses. One of these regions has been developed as a signature pattern.

Consensus pattern: W-x(2)-[LF]-x(6,7)-G-[LIVM]-[FYRA]-[NH]-x(3)-[STAQLIVM]-[ASC]-x(2)-[PA]

[1] Nillson O., Lundqvist T., Hahne S., Sjoberg B.-M. Biochem. Soc. Trans. 16:91-94(1988).
[2] Reichard P. Science 260:1773-1777(1993).

579. RNase H 

RNase H digests the RNA strand of an RNA/DNA hybrid. It is an important enzyme in the
retroviral replication cycle and is often found as a domain associated with reverse
transcriptases. The structure is a mixed alpha+beta fold with three a/b/a layers.

580. Eukaryotic putative RNA-binding region RNP-1 signature (rrm)

Many eukaryotic proteins that are known or supposed to bind single-stranded RNA contain
one or more copies of a putative RNA-binding domain of about 90 amino acids [1,2]. This
region has been found in the following proteins:

Heterogeneous nuclear ribonucleoproteins:
- hnRNP A1 (helix destabilizing protein) (twice).
- hnRNP A2/B1 (twice).
- hnRNP C (C1/C2) (once).
- hnRNP E (UP2) (at least once).
- hnRNP G (once).

Small nuclear ribonucleoproteins:
- U1 snRNP 70 Kd (once).
- U1 snRNP A (once).
- U2 snRNP B" (once).

Pre-RNA and mRNA associated proteins:
- Protein synthesis initiation factor 4B (eIF-4B) [3], a protein essential for the binding
  of mRNA to ribosomes (once).
- Nucleolin (4 times).
- Yeast single-stranded nucleic acid-binding protein (gene SSB1) (once).
- Yeast protein NSR1 (twice). NSR1 is involved in pre-rRNA processing; it specifically
  binds nuclear localization sequences.
- Poly(A) binding protein (PABP) (4 times).

Others:
- Drosophila sex determination protein Sex-lethal (Sxl) (twice).
- Drosophila sex determination protein Transformer-2 (Tra-2) (once).
- Drosophila 'elav' protein (3 times), which is probably involved in the RNA metabolism of
  neurons.
- Human paraneoplastic encephalomyelitis antigen HuD (3 times) [4], which is highly similar
  to elav and which may play a role in neuron-specific RNA processing.
- Drosophila 'bicoid' protein (once) [5], a segment-polarity homeobox protein that may also
  bind to specific mRNAs.
- La antigen (once), a protein which may play a role in the transcription of RNA
  polymerase III.
- The 60 Kd Ro protein (once), a putative RNP complex protein.
- A maize protein induced by abscisic acid in response to water stress, which seems to be
  an RNA-binding protein.
- Three tobacco proteins, located in the chloroplast [6], which may be involved in splicing
  and/or processing of chloroplast RNAs (twice).
- X16 [7], a mammalian protein which may be involved in RNA processing in relation with
  cellular proliferation and/or maturation.
- Insulin-induced growth response protein Cl-4 from rat (twice).
- Nucleolysins TIA-1 and TIAR (3 times) [8], which possess nucleolytic activity against
  cytotoxic lymphocyte target cells and may be involved in apoptosis.
- Yeast RNA15 protein, which plays a role in mRNA stability and/or poly-(A) tail length [9].

Inside the putative RNA-binding domain there are two regions which are highly conserved.
The first one is a hydrophobic segment of six residues (which is called the RNP-2 motif),
the second one is an octapeptide motif (which is called RNP-1 or RNP-CS). The position of
both motifs in the domain is shown in the following schematic representation:

xxxxxxx######xxxxxxxxxxxxxxxxxxxxxxxxxxxxx########xxxxxxxxxxxxxxxxxxxxxxxxx
       RNP-2                              RNP-1

The RNP-1 motif has been used as a signature pattern for this type of domain. 

Consensus pattern: [RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYLM]

In most cases the residue in position 3 of the pattern is either Tyr or Phe.
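
The third position of this pattern is written with braces, which exclude the listed residues rather than allow them. As a rough sketch of how the statement about position 3 could be checked on a set of sequences, the example below translates the pattern by hand into a regular expression with a negated character class and tallies the residue found at position 3 of each match. The function name position3_residues and the three toy sequences are assumptions made for illustration only.

```python
import re
from collections import Counter

# RNP-1 pattern from the entry above; {EDRKHPCG} becomes the negated class
# [^EDRKHPCG], captured so the residue at position 3 can be inspected.
RNP1 = re.compile(r"[RK]G([^EDRKHPCG])[AGSCI][FY][LIVA].[FYLM]")


def position3_residues(sequences):
    """Count which residue occupies position 3 in every RNP-1 match."""
    counts = Counter()
    for seq in sequences:
        for match in RNP1.finditer(seq):
            counts[match.group(1)] += 1
    return counts


# Hypothetical toy sequences: the first two carry a match, the third has an
# excluded residue (E) at position 3 and therefore does not match.
toy_sequences = ["MSRGFGFVTFAA", "TTKGYAFVEYKK", "AARGEGFVTFAA"]
print(position3_residues(toy_sequences))
```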

[1] Bandziulis R.J., Swanson M.S., Dreyfuss G. Genes Dev. 3:431-437(1989).
[2] Dreyfuss G., Swanson M.S., Pinol-Roma S. Trends Biochem. Sci. 13:86-91(1988).
[3] Milburn S.C., Hershey J.W.B., Davies M.V., Kelleher K., Kaufman R.J. EMBO J.
9:2783-2790(1990).
[4] Szabo A., Dalmau J., Manley G., Rosenfeld M., Wong E., Henson J., Posner J.B., Furneaux
H.M. Cell 67:325-333(1991).
[5] Rebagliati M. Cell 58:231-232(1989).
[6] Li Y., Sugiura M. EMBO J. 9:3059-3066(1990).
[7] Ayane M., Preuss U., Koehler G., Nielsen P.J. Nucleic Acids Res. 19:1273-1278(1991).
[8] Kawakami A., Tian Q., Duan X., Streuli M., Schlossman S.F., Anderson P. Proc. Natl.
Acad. Sci. U.S.A. 89:8681-8685(1992).
[9] Minvielle-Sebastia L., Winsor B., Bonneaud N., Lacroute F. Mol. Cell. Biol.
11:3075-3087(1991).

581. Rubredoxin signature 

Rubredoxins [1] are small electron-transfer prokaryotic proteins. They contain an iron 
atom which is ligated by four cysteine residues. Rubredoxins are, in some cases, 
functionally interchangeable with ferredoxins. 

A conserved region that includes two of the cysteine residues that bind the iron atom 
has been selected as a pattern for these proteins. 

Consensus pattern: [LIVM]-x(3)-W-x-C-P-x-C-[AGD] [The two Cs bind the iron atom]

In Pseudomonas oleovorans rubredoxin 2 (gene alkG) [2], this pattern is found twice because 
alkG has two rubredoxin domains. 

Rubrerythrin [3], a protein with inorganic pyrophosphatase activity from Desulfovibrio
vulgaris, possesses a C-terminal rubredoxin-like domain, but this domain is too divergent to
be detected by the above pattern.

[1] Berg J.M., Holm R.H. (In) Iron-sulfur proteins. Spiro T.G., Ed., pp1-66, Wiley,
New-York, (1982).
[2] Kok M., Oldenhuis R., der Linden M.P.G., Meulenberg C.H.C., Kingma J., Witholt B.
J. Biol. Chem. 264:5442-5451(1989).
[3] van Beeumen J.J., van Driessche G., Liu M.-Y., Le Gall J. J. Biol. Chem.
266:20645-20653(1991).

582. (rvp) Eukaryotic and viral aspartyl proteases active site 

Aspartyl proteases, also known as acid proteases (EC 3.4.23.-), are a widely distributed
family of proteolytic enzymes [1,2,3] known to exist in vertebrates, fungi, plants,
retroviruses and some plant viruses. Aspartate proteases of eukaryotes are monomeric enzymes
which consist of two domains. Each domain contains an active site centered on a catalytic
aspartyl residue. The two domains most probably evolved from the duplication of an ancestral
gene encoding a primordial domain. Currently known eukaryotic aspartyl proteases are:
- Vertebrate gastric pepsins A and C (also known as gastricsin).
- Vertebrate chymosin (rennin), involved in digestion and used for making cheese.
- Vertebrate lysosomal cathepsins D (EC 3.4.23.5) and E (EC 3.4.23.34).
- Mammalian renin (EC 3.4.23.15) whose function is to generate angiotensin I from
  angiotensinogen in the plasma.
- Fungal proteases such as aspergillopepsin A (EC 3.4.23.18), candidapepsin (EC 3.4.23.24),
  mucoropepsin (EC 3.4.23.23) (mucor rennin), endothiapepsin (EC 3.4.23.22), polyporopepsin
  (EC 3.4.23.29), and rhizopuspepsin (EC 3.4.23.21).
- Yeast saccharopepsin (EC 3.4.23.25) (proteinase A) (gene PEP4). PEP4 is implicated in
  posttranslational regulation of vacuolar hydrolases.
- Yeast barrier pepsin (EC 3.4.23.35) (gene BAR1); a protease that cleaves alpha-factor and
  thus acts as an antagonist of the mating pheromone.
- Fission yeast sxa1 which is involved in degrading or processing the mating pheromones.
Most retroviruses and some plant viruses, such as badnaviruses, encode an aspartyl protease
which is a homodimer of a chain of about 95 to 125 amino acids. In most retroviruses, the
protease is encoded as a segment of a polyprotein which is cleaved during the maturation
process of the virus. It is generally part of the pol polyprotein and, more rarely, of the
gag polyprotein. Conservation of the sequence around the two aspartates of eukaryotic
aspartyl proteases and around the single active site of the viral proteases allows us to
develop a single signature pattern for both groups of protease.

Consensus pattern: [LIVMFGAC]-[LIVMTADN]-[LIVFSA]-D-[ST]-G-[STAV]-[STAPDENQ]-x-[LIVMFSTNC]-x-[LIVMFGTA]
[D is the active site residue]

[1] Foltmann B. Essays Biochem. 17:52-84(1981).
[2] Davies D.R. Annu. Rev. Biophys. Chem. 19:189-215(1990).
[3] Rao J.K.M., Erickson J.W., Wlodawer A. Biochemistry 30:4663-4671(1991).
[4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:105-120(1995).

583. (rvt) Reverse transcriptase (RNA-dependent DNA polymerase)

A reverse transcriptase gene is usually indicative of a mobile element such as a 
retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, 
including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, 
and caulimoviruses. Number of members: 1233 

[1] Medline: 91006031. Origin and evolution of retroelements based upon their reverse 
transcriptase sequences. Xiong Y, Eickbush TH; EMBO J 1990;9:3353-3362. 

584. (S-AdoMet synt) S-adenosylmethionine synthetase signatures 

S-adenosylmethionine synthetase (EC 2.5.1.6) is the enzyme that catalyzes the formation of
S-adenosylmethionine (AdoMet) from methionine and ATP [1]. AdoMet is an important methyl
donor for transmethylation and is also the propylamino donor in polyamine biosynthesis. In
bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in
budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a
multigene family. The sequence of AdoMet synthetase is highly conserved throughout
isozymes and species. Two signature patterns have been selected for this type of enzyme; the
first is a hexapeptide which seems to be involved in ATP-binding; the second is an almost
perfectly conserved glycine-rich nonapeptide.

Consensus pattern: G-A-G-D-Q-G-x(3)-G-[FYH]

Consensus pattern: G-[GA]-G-[ASC]-F-S-x-K-[DE] 

[1] Horikawa S., Sasuga J., Shimizu K., Ozasa H., Tsukada K. J. Biol. Chem.
265:13683-13686(1990).

585. S1 RNA binding domain

The S1 domain occurs in a wide range of RNA associated proteins. It is structurally
similar to cold shock protein, which binds nucleic acids. The S1 domain has an OB-fold
structure.




[1] Bycroft M, Hubbard TJ, Proctor M, Freund SM, Murzin AG; Cell 1997;88:235-242.

586. SAICAR synthetase signatures

Phosphoribosylaminoimidazole-succinocarboxamide synthase (EC 6.3.2.6) (SAICAR
synthetase) catalyzes the seventh step in the de novo purine biosynthetic pathway:
the ATP-dependent conversion of 5'-phosphoribosyl-5-aminoimidazole-4-carboxylic acid and
aspartic acid to SAICAR [1]. In bacteria (gene purC), fungi (gene ADE1) and plants,
SAICAR synthetase is a monofunctional protein; in higher vertebrates it is the N-terminal
domain of a bifunctional enzyme that also has phosphoribosylaminoimidazole
carboxylase (AIRC) activity. Two conserved regions in the central section of this enzyme
have been selected as signature patterns for SAICAR synthetase.

Consensus pattern: [LIVMF](2)-P-[LIVM]-E-x-[LIVM]-[LIVMCA]-R-x(3)-[TA]-G-S

Consensus pattern: [LIVM]-[LIVMA]-D-x-K-[LIVMFY]-E-F-G 

[ 1] Zalkin H., Dixon J.E. Prog. Nucleic Acid Res. Mol. Biol. 42:259-287(1992). 

587. (SCP) Extracellular proteins SCP/Tpx-1/Ag5/PR-1/Sc7 signatures

A variety of extracellular proteins from eukaryotes have been found to be evolutionarily
related:
- Rodent sperm-coating glycoprotein (SCP), also known as acidic epididymal glycoprotein
  (AEG). This protein is thought to be involved in sperm maturation [1]. It is a protein of
  about 220 residues and probably contains eight disulfide bonds.
- Mammalian testis-specific protein Tpx-1 [2]. Tpx-1 is highly related to SCP's.
- Mammalian glioma pathogenesis-related protein (GliPR).
- Lizard helothermine, a toxin that blocks ryanodine receptors.
- Venom allergen 5 (Ag5) from vespid wasps and venom allergen 3 (Ag3) from fire ants. These
  proteins are potent allergens and are the main cause of allergic reactions to stings from
  insects of the hymenoptera family [3]. Ag5/3 are proteins of about 200 residues and
  contain four disulfide bonds.
- Plant pathogenesis proteins of the PR-1 family [4]. These proteins are synthesized during
  pathogen infection or other stress-related responses. They are proteins of about 130 to
  140 residues and probably contain three disulfide bonds.
- Proteins Sc7 and Sc14 from the basidiomycete fungus Schizophyllum commune. These
  extracellular proteins are loosely associated with fruit body hyphal walls [5]. Sc7/14
  are proteins of about 180 residues and probably contain two disulfide bonds.
- Ancylostoma secreted protein from dog hookworm.
- Yeast hypothetical proteins YJL078c, YJL079c and YKR013w.
The exact function of these proteins is not yet known. Two conserved regions located in
their C-terminal half have been selected as signature patterns. The second signature
contains a cysteine which is known to be involved in a disulfide bond in Ag5.

Consensus pattern: [GDER]-H-[FYWH]-T-Q-[LIVM](2)-W-x(2)-[STN]

Consensus pattern: [LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-[PARH]-x-[GL]-N-[LIVMFYWDN]
[C is involved in a disulfide bond]

[1] Mizuki N., Kasahara M. Mol. Cell. Endocrinol. 89:25-32(1992).
[2] Kasahara M., Gutknecht J., Brew K., Spurr N., Goodfellow P.N. Genomics
5:527-534(1989).
[3] Lu G., Villalba M., Coscia M.R., Hoffman D.R., King T.P. J. Immunol.
150:2823-2830(1993).
[4] Dixon D.C., Cutt J.R., Klessig D.F. EMBO J. 10:1317-1324(1991).
[5] Schuren F.H.J., Asgeirsdottir S.A., Kothe E.M., Scheer J.M.J., Wessels J.G.H. J. Gen.
Microbiol. 139:2083-2090(1993).

588. SET domain 

SET domains appear to be protein-protein interaction domains. It has been
demonstrated that SET domains mediate interactions with a family of proteins
that display similarity with dual-specificity phosphatases (dsPTPases) [2].

[1] Tripoulas N, LaJeunesse D, Gildea J, Shearn A; Genetics 1996;143:913-928.
[2] Cui X, De Vivo I, Slany R, Miyamoto A, Firestein R, Cleary ML; Nat Genet
1998;18:331-337.

589. Src homology 3 (SH3) domain profile 

The Src homology 3 (SH3) domain is a small protein domain of about 60 amino-acid residues
first identified as a conserved sequence in the non-catalytic part of several cytoplasmic
protein tyrosine kinases (e.g. Src, Abl, Lck) [1]. Since then, it has been found in a great
variety of other intracellular or membrane-associated proteins [2,3,4,5]. The SH3 domain has
a characteristic fold which consists of five or six beta-strands arranged as two tightly
packed anti-parallel beta sheets. The linker regions may contain short helices [6]. The
function of the SH3 domain is not well understood. The current opinion is that SH3 domains
mediate assembly of specific protein complexes via binding to proline-rich peptides [7]. In
general SH3 domains are found as single copies in a given protein, but there is a
significant number of proteins with two SH3 domains and a few with 3 or 4 copies. So far,
SH3 domains have been identified in the following proteins:
- Many vertebrate, invertebrate and retroviral cytoplasmic (non-receptor) protein tyrosine
  kinases, in particular in the Src, Abl, Btk, Csk and ZAP70 families of kinases.
- Mammalian phosphatidylinositol-specific phospholipase C-gamma-1 and -2.
- Mammalian phosphatidylinositol 3-kinase regulatory p85 subunit.
- Mammalian Ras GTPase-activating protein (GAP).
- Adaptor proteins mediating binding of guanine nucleotide exchange factors to growth
  factor receptors: vertebrate GRB2, Caenorhabditis elegans sem-5 and Drosophila DRK, all
  of which have two SH3 domains.
- Mammalian Vav oncoprotein, a guanine nucleotide exchange factor of the CDC24 family.
- Some guanine-nucleotide releasing factors of the CDC25 family: yeast CDC25, yeast SCD25,
  fission yeast ste6.
- MAGUK proteins. These proteins consist of at least three types of domains: one or more
  copies of the DHR domain, a SH3 domain and a C-terminal guanylate kinase domain. Members
  of this family are: Drosophila lethal(1)discs large-1 tumor suppressor protein (gene
  Dlg1), mammalian tight junction protein ZO-1, vertebrate erythrocyte membrane protein
  p55, Caenorhabditis elegans protein lin-2, rat protein CASK and mammalian synaptic
  proteins SAP90/PSD-95, CHAPSYN-110/PSD-93, SAP97/DLG1 and SAP102.
- Miscellaneous proteins interacting with vertebrate receptor protein tyrosine kinases:
  mammalian cytoplasmic protein Nck (3 copies), oncoprotein Crk (2 copies).
- Chicken Src substrate p80/85 protein (cortactin) and the similar human hemopoietic
  lineage cell specific protein HS1.
- Mammalian dihydropyridine-sensitive L-type calcium channel beta (regulatory) subunit
  including the related human myasthenic syndrome antigen B (MSYB).
- Mammalian neutrophil cytosolic activators of NADPH oxidase: p47 (NCF-1), p67 (NCF-2), and
  a potential homolog from Caenorhabditis elegans (B0303.7). NCF-1 and -2 have two copies
  of the SH3 domain, while B0303.7 has four.
- Some myosin heavy chains from amoebae, slime molds and yeast (gene MYO3).
- Vertebrate and Drosophila spectrin and fodrin alpha-chain.
- Human amphiphysin.
- Yeast actin-binding protein ABP1.
- Yeast actin-binding protein SLA1 (3 copies).
- Yeast protein BEM1 and the fission yeast homolog scd2 (or ral3) (2 copies).
- Yeast BEM1-binding proteins BOI2 (BEB1) and BOB1 (BOI1).
- Yeast fusion protein FUS1.
- Yeast protein RVS167.
- Yeast protein SSU81.
- Yeast hypothetical proteins YAR014C (1 copy), YFR024c (1 copy), YHL002w (1 copy), YHR016c
  (1 copy), YJL020C (1 copy), YHR114w (2 copies) and the fission yeast homolog
  SpAC12C2.05c.
- Caenorhabditis elegans hypothetical protein F42H10.3.
The profile developed to detect SH3 domains is based on a structural alignment consisting
of 5 gap-free blocks and 4 linker regions totaling 62 match positions.

[1] Mayer B.J., Hamaguchi M., Hanafusa H. Nature 332:272-275(1988).
[2] Musacchio A., Gibson T., Lehto V.P., Saraste M. FEBS Lett. 307:55-61(1992).
[3] Pawson T., Schlessinger J. Curr. Biol. 3:434-442(1993).
[4] Mayer B.J., Baltimore D. Trends Cell Biol. 3:8-13(1993).
[5] Pawson T. Nature 373:573-580(1995).
[6] Kuriyan J., Cowburn D. Curr. Opin. Struct. Biol. 3:828-837(1993).
[7] Morton C.J., Campbell I.D. Curr. Biol. 4:615-617(1994).

590. Serine hydroxymethyltransferase pyridoxal-phosphate attachment site (SHMT)

Serine hydroxymethyltransferase (EC 2.1.2.1) (SHMT) [1] catalyzes the transfer of the
hydroxymethyl group of serine to tetrahydrofolate to form 5,10-methylenetetrahydrofolate
and glycine. In vertebrates, it exists in a cytoplasmic and a mitochondrial form whereas only
one form is found in prokaryotes. Serine hydroxymethyltransferase is a pyridoxal-phosphate
containing enzyme. The pyridoxal-P group is attached to a lysine residue around which the
sequence is highly conserved in all forms of the enzyme.

Consensus pattern: [DEH]-[LIVMFY]-x-[STMV]-[GST]-[ST](2)-H-K-[ST]-[LF]-x-G-[PAC]-[RQ]-[GSA]-[GA]
[K is the pyridoxal-P attachment site]

[1] Usha R., Savithri H.S., Rao N.A. Biochim. Biophys. Acta 1204:75-83(1994).

591. SIS domain 

SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and 
phosphosugar binding proteins. 

[1] Teplyakov A, Obmolova G, Badet-Denisot MA, Badet B, Polikarpov I; Structure
1998;6:1047-1055.

592. (SKI) Shikimate kinase signature 

Shikimate kinase (EC 2.7.1.71) catalyzes the fifth step in the biosynthesis from chorismate of
the aromatic amino acids (the shikimate pathway) in bacteria (gene aroK or aroL), plants and
in fungi (where it is part of a multifunctional enzyme which catalyzes five consecutive steps
in this pathway). Shikimate kinase is a small protein of about 200 residues. A conserved
region that contains a run of three glycines has been selected as a signature pattern.

Consensus pattern: [KR]-x(2)-E-x(3)-[LIVMF]-x(8,12)-[LIVMF](2)-[SA]-x-G(3)-x-[LIVMF]

Proteins belonging to this family also contain a copy of the ATP/GTP-binding motif 'A'
(P-loop).

593. SNAP-25 family 

SNAP-25 (synaptosome-associated protein 25 kDa) proteins are components of 
SNARE complexes. Members of this family contain a cluster of cysteine residues that can be 
palmitoylated for membrane attachment [2]. 

[1] Brennwald P, Kearns B, Champion K, Keranen S, Bankaitis V, Novick P; Cell
1994;79:245-258.
[2] Risinger C, Blomqvist AG, Lundell I, Lambertsson A, Nassel D, Pieribone VA, Brodin L,
Larhammar D; J Biol Chem 1993;268:24408-24414.

594. SNF2 and others N-terminal domain 

This domain is found in proteins involved in a variety of
processes including transcription regulation (e.g., SNF2, STH1,
brahma, MOT1), DNA repair (e.g., ERCC6, RAD16, RAD5), DNA
recombination (e.g., RAD54), and chromatin unwinding (e.g., ISWI)
as well as a variety of other proteins with little functional
information (e.g., lodestar, ETL1).

595. Staphylococcal nuclease homologues (Snase) 

Present in all three domains of cellular life. Four copies in the transcriptional coactivator
p100. These, however, appear to lack the active site residues of Staphylococcal nuclease.
Positions 14 (Asp-21), 34 (Arg-35), 39 (Asp-40), 42 (Glu-43) and 110 (Arg-87)
[SNase numbering in parentheses] are thought to be involved in substrate-binding and
catalysis.

[1] Ponting CP; Protein Sci 1997;6:459-463.
[2] Callebaut I, Mornon JP; Biochem J 1997;321:125-132.

596. SPRY domain

The SPRY domain is named from SPla and the RYanodine Receptor. Domain of unknown
function. Distant homologues are domains in butyrophilin/marenostrin/pyrin
homologues.

[1] Ponting C, Schultz J, Bork P; Trends Biochem Sci 1997;22:193-194. 

597. (SQS PSY) Squalene and phytoene synthases signatures 

Two different polyisoprene synthases have been shown [1,2,3] to share a number of regions
of sequence similarity:
- Squalene synthase (EC 2.5.1.21) (farnesyl-diphosphate farnesyltransferase) (SQS), which
  catalyzes the conversion of two molecules of farnesyl diphosphate (FPP) into squalene. It
  is the first committed step in the cholesterol biosynthetic pathway. The reaction carried
  out by SQS is catalyzed in two separate steps: the first is a head-to-head condensation
  of the two molecules of FPP to form presqualene diphosphate; this intermediate is then
  rearranged in a NADP-dependent reduction, to form squalene. SQS is found in eukaryotes.
  In yeast it is encoded by the ERG9 gene, in mammals by the FDFT1 gene. SQS seems to be
  membrane-bound.
- Phytoene synthase (EC 2.5.1.-) (PSY), which catalyzes the conversion of two molecules of
  geranylgeranyl diphosphate (GGPP) into phytoene. It is the second step in the
  biosynthesis of carotenoids from isopentenyl diphosphate. The reaction carried out by PSY
  is catalyzed in two separate steps: the first is a head-to-head condensation of the two
  molecules of GGPP to form prephytoene diphosphate; this intermediate is then rearranged
  to form phytoene. PSY is found in all organisms that synthesize carotenoids: plants and
  photosynthetic bacteria as well as some non-photosynthetic bacteria and fungi. In
  bacteria PSY is encoded by the gene crtB. In plants PSY is localized in the chloroplast.
As can be seen from the description above, SQS and PSY share a number of functional
similarities which are also reflected at the level of their primary structure. In
particular, three well conserved regions are shared by SQS and PSY; they could be involved
in substrate binding and/or the catalytic mechanism. Signature patterns have been developed
for the second and third conserved regions; they are localized in the central part of these
enzymes.

Consensus pattern: Y-[CSAM]-x(2)-[VSG]-A-[GSA]-[LIVAT]-[IV]-G-x(2)-[LMSC]-x(2)-[LIV]

Consensus pattern: [LIVM]-G-x(3)-Q-x(2,3)-N-[IF]-x-R-D-[LIVMFY]-x(2)-[DE]-x(4,7)-R-x-[FY]-x-P

[1] Summers C., Karst F., Charles A.D. Gene 136:185-192(1993).
[2] Robinson G.W., Tsay Y.H., Kienzle B.K., Smith-Monroy C.A., Bishop R.W. Mol. Cell. Biol.
13:2706-2727(1993).
[3] Roemer S., Hugueney P., Bouvier F., Camara B., Kuntz M. Biochem. Biophys. Res.
Commun. 196:1414-1421(1993).

598. SRP54-type proteins GTP-binding domain signature 

The signal recognition particle (SRP) is an oligomeric complex that mediates targeting and
insertion of the signal sequence of exported proteins into the membrane of the endoplasmic
reticulum. SRP consists of a 7S RNA and six protein subunits. One of these subunits, the 54
Kd protein (SRP54), is a GTP-binding protein that interacts with the signal sequence when it
emerges from the ribosome. The N-terminal 300 residues of SRP54 include the GTP-binding
site (G-domain) and are evolutionarily related to similar domains in other proteins which
are listed below [1].
- Escherichia coli and Bacillus subtilis ffh protein (P48), a protein which seems to be the
  prokaryotic counterpart of SRP54. Ffh is associated with a 4.5S RNA in the prokaryotic
  SRP complex.
- Signal recognition particle receptor alpha subunit (docking protein), an integral
  membrane GTP-binding protein which ensures, in conjunction with SRP, the correct
  targeting of nascent secretory proteins to the endoplasmic reticulum membrane. The
  G-domain is located at the C-terminal extremity of the protein.
- Bacterial ftsY protein, a protein which is believed to play a similar role to that of the
  docking protein in eukaryotes. The G-domain is located at the C-terminal extremity of the
  protein.
- The pilA protein from Neisseria gonorrhoeae which seems to be the homolog of ftsY.
- A protein from the archaebacterium Sulfolobus solfataricus. This protein is also believed
  to be a docking protein. The G-domain is also at the C-terminus.
- Bacterial flagellar biosynthesis protein flhF.
The best conserved regions in those domains are the sequence motifs that are part of the
GTP-binding site, but as those regions are not specific to these proteins, they were not
used as a signature pattern. Instead, a conserved region located at the C-terminal end of
the domain was selected.

Consensus pattern: P-[LIVM]-x-[FYL]-[LIVMAT]-[GS]-x-[GS]-[EQ]-x(4)-[LIVMF]

[1] Althoff S., Selinger D., Wise J.A. Nucleic Acids Res. 22:1933-1947(1994).



599. (STphosphatase) Serine/threonine specific protein phosphatases signature

Serine/threonine specific protein phosphatases (EC 3.1.3.16) (PP) [1,2,3] are enzymes that
catalyze the removal of a phosphate group attached to a serine or a threonine residue. The
phosphatases listed below are evolutionarily related.
- Protein phosphatase-1 (PP1) is an enzyme of broad specificity. It is inhibited by two
  thermostable proteins, inhibitor-1 and -2. In mammals, there are two closely related
  isoforms of PP-1: PP-1alpha and PP-1beta, produced by alternative splicing of the same
  gene. In Emericella nidulans, PP-1 (gene bimG) plays an important role in mitosis control
  by reversing the action of the nimA kinase. In yeast, PP-1 (gene SIT4) is involved in
  dephosphorylating the large subunit of RNA polymerase II.
- Protein phosphatase-2A (PP2A) is also an enzyme of broad specificity. PP2A is a trimeric
  enzyme that consists of a core composed of a catalytic subunit associated with a 65 Kd
  regulatory subunit and a third variable subunit. In mammals, there are two closely
  related isoforms of the catalytic subunit of PP2A: PP2A-alpha and PP2A-beta, encoded by
  separate genes.
- Protein phosphatase-2B (PP2B or calcineurin), a calcium-dependent enzyme whose activity
  is stimulated by calmodulin. It is composed of two subunits: the catalytic A-subunit and
  the calcium-binding B-subunit. The specificity of PP2B is restricted.
In addition to the above-mentioned enzymes, some additional serine/threonine specific
protein phosphatases have been characterized and are listed below.
- Mammalian phosphatase-X (PP-X), and Drosophila phosphatase-V (PP-V) which are closely
  related but yet distinct from PP2A.
- Yeast phosphatase PPH3, which is similar to PP2A, but with different enzymatic
  properties.
- Drosophila phosphatase-Y (PP-Y), and yeast phosphatases Z1 and Z2 (genes PPZ1 and PPZ2)
  which are closely related but yet distinct from PP1.
- Drosophila retinal degeneration protein C (gene rdgC), a calcium-binding phosphatase
  required to prevent light-induced retinal degeneration.
- Phages Lambda and Phi-80 ORF-221 which have been shown to have phosphatase activity and
  are related to mammalian PP's.
The best conserved region in these proteins is a perfectly conserved pentapeptide that can
be used as a signature pattern.

Consensus pattern: [LIVM]-R-G-N-H-E- 
[ 1] Cohen P. Annu. Rev. Biochem. 58:453-508(1989). [ 2] Cohen P., Cohen P.T.W. J. Biol.
Chem. 264:21435-21438(1989). [ 3] Cohen P.T.W., Brewis N.D., Hughes V., Mann D.J.
FEBS Lett. 268:355-359(1990).

600. Translation initiation factor SUI1 signature

In budding yeast (Saccharomyces cerevisiae), SUI1 is a translation initiation factor that
functions in concert with eIF-2 and the initiator tRNA-Met in directing the ribosome to the
proper start site of translation [1]. SUI1 is a protein of 108 residues. Close homologs of SUI1
have been found [2] in mammals, insects and plants. SUI1 is also evolutionarily related to
hypothetical proteins from Escherichia coli (yciH), Haemophilus influenzae (HI1225) and
Methanococcus vannielii. A conserved region in the C-terminal section has been selected as a
signature pattern.

Consensus pattern: [LIVM]-[EQ]-[LIVM]-Q-G-[DEN]-[KHQ]-[KRV] 
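As a hedged illustration of how a consensus pattern written in this notation might be applied to a sequence, the following Python sketch translates the notation into a regular expression and scans a sequence with it. The function name `prosite_to_regex` and the toy sequence are assumptions made for illustration only; they are not part of the original disclosure, and the sketch covers only the notation elements seen in these entries (single residues, bracketed alternatives, braces for excluded residues, `x`, and repeat counts).

```python
import re

# One pattern element: a residue, an allowed set [..], a forbidden set {..}, or x,
# optionally followed by a repeat count such as (2) or (4,5).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style consensus pattern into a Python regex string."""
    parts = []
    # Tolerate stray spaces and a trailing '-' or '.' left over from typesetting.
    cleaned = pattern.replace(" ", "").strip("-.")
    for element in cleaned.split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError("unrecognized pattern element: %r" % element)
        spec, lo, hi = m.groups()
        if spec == "x":
            spec = "."                       # x stands for any residue
        elif spec.startswith("{"):
            spec = "[^" + spec[1:-1] + "]"   # {..} lists residues that are not allowed
        if lo is not None:
            spec += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(spec)
    return "".join(parts)


# SUI1 pattern from the entry above; the sequence is a made-up example.
SUI1 = "[LIVM]-[EQ]-[LIVM]-Q-G-[DEN]-[KHQ]-[KRV]"
toy_sequence = "AAAIELQGDQRAAA"
match = re.search(prosite_to_regex(SUI1), toy_sequence)
print(match.start() + 1, match.group())  # 1-based start position and matched segment
```

The same function can be reused for the other consensus patterns in this section; patterns that are wrapped over two lines need to be joined into one string first.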

[ 1] Yoon H., Donahue T.F. Mol. Cell. Biol. 12:248-260(1992). [ 2] Fields C.A., Adams M.D.
Biochem. Biophys. Res. Commun. 198:288-291(1994).

601. (S T dehydratase) Serine/threonine dehydratases pyridoxal-phosphate attachment site
Serine and threonine dehydratases [1,2] are functionally and structurally related
pyridoxal-phosphate dependent enzymes:
- L-serine dehydratase (EC 4.2.1.13) and D-serine dehydratase (EC 4.2.1.14) catalyze the
  dehydration of L-serine (respectively D-serine) into ammonia and pyruvate.
- Threonine dehydratase (EC 4.2.1.16) (TDH) catalyzes the dehydration of threonine into
  alpha-ketobutyrate and ammonia. In Escherichia coli and other microorganisms, two classes
  of TDH are known to exist. One is involved in the biosynthesis of isoleucine, the other in
  hydroxyamino acid catabolism.
Threonine synthase (EC 4.2.99.2) is also a pyridoxal-phosphate enzyme; it catalyzes the
transformation of homoserine-phosphate into threonine. It has been shown [3] that threonine
synthase is distantly related to the serine/threonine dehydratases. In all these enzymes, the
pyridoxal-phosphate group is attached to a lysine residue. The sequence around this residue
is sufficiently conserved to allow the derivation of a pattern specific to serine/threonine
dehydratases and threonine synthases.

Consensus pattern: [DESH]-x(4,5)-[STVG]-x-[AS]-[FYI]-K-[DLIFSA]-[RVMF]-[GA]-[LIVMGA]
[The K is the pyridoxal-P attachment site]

[ 1] Ogawa H., Gomi T., Konishi K., Date T., Nakashima H., Nose K., Matsuda Y., Peraino
C., Pitot H.C., Fujioka M. J. Biol. Chem. 264:15818-15823(1989). [ 2] Datta P., Goss T.J.,
Omnaas J.R., Patil R.V. Proc. Natl. Acad. Sci. U.S.A. 84:393-397(1987). [ 3] Parsot C.
EMBO J. 5:3013-3019(1986). [ 4] Grabowski R., Hofmeister A.E.M., Buckel W. Trends
Biochem. Sci. 18:297-300(1993).

Cysteine synthase/cystathionine beta-synthase P-phosphate attachment site 
Cysteine synthase (CSase) is the pyridoxal-phosphate dependent enzyme responsible [1] for
the formation of cysteine from O-acetyl-serine and hydrogen sulfide with the concomitant
release of acetic acid. In bacteria such as Escherichia coli, two forms of the enzyme are
known (genes cysK and cysM). In plants there are also two forms, one located in the
cytoplasm and the other in chloroplasts. Cystathionine beta-synthase [2] catalyzes the first
irreversible step in homocysteine transsulfuration: the conjugation of homocysteine and
serine, forming cystathionine. Like CSase, it is a pyridoxal-phosphate dependent enzyme. The
two types of enzymes are evolutionarily related. The pyridoxal-phosphate group of CSases has
been shown to be attached to a lysine residue which is located in the N-terminal section of
these enzymes; the sequence around this residue is highly conserved and can be used as a
signature pattern to detect this class of enzymes.

Consensus pattern: K-x-E-x(3)-[PA]-[STAGC]-x-S-[IVAP]-K-x-R-x-[STAG]-x(2)-[LIVM]
[The 2nd K is the pyridoxal-P attachment site]
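The annotation that the second K is the attachment site can be carried into a scan by marking that residue with a capture group. The sketch below is illustrative only: the regular expression is a hand translation of the consensus pattern above, and the name `attachment_sites` and the toy sequences are assumptions, not part of the original disclosure.

```python
import re

# Hand translation of the consensus pattern; the parenthesized K is the second K,
# i.e. the residue annotated as the pyridoxal-P attachment site.
CSASE_SITE = re.compile(r"K.E.{3}[PA][STAGC].S[IVAP](K).R.[STAG].{2}[LIVM]")


def attachment_sites(sequence):
    """Return 1-based positions of the annotated lysine for every pattern match."""
    return [m.start(1) + 1 for m in CSASE_SITE.finditer(sequence)]


# Made-up sequences: the first contains the motif, the second has the lysine replaced.
with_site = "MMKAEAAAPSASIKARASAAL"
without_site = "MMKAEAAAPSASIRARASAAL"
print(attachment_sites(with_site))
print(attachment_sites(without_site))
```

A match only indicates that the sequence fits the signature; whether the lysine actually carries the cofactor would still need experimental or structural support.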

[ 1] Saito K., Kurosawa M., Murakoshi I. FEBS Lett. 328:111-114(1993).[ 2] Swaroop M., 
Bradley K., Ohura T., Tahara T., Roper M.D., Rosenberg L.E., Kraus J. P. J. Biol. Chem. 
267:11455-11461(1992). 

602. S locus glycop 

S-locus glycoprotein family. In Brassicaceae, self-incompatible plants have a self/non-self
recognition system. This is sporophytically controlled by multiple alleles at a single
locus (S). S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the
S-alleles [1]. Number of members: 128
[1] Evolutionary aspects of the S-related genes of the Brassica self-incompatibility system:
synonymous and nonsynonymous base substitutions. Hinata K, Watanabe M, Yamakawa S,
Satta Y, Isogai A; Genetics 1995;140:1099-1104. [2] Polymorphism of the S-locus
glycoprotein gene (SLG) and the S-locus related gene (SLR1) in Raphanus sativus L. and
self-incompatible ornamental plants in the Brassicaceae. Sakamoto K, Kusaba M, Nishio T;
Mol Gen Genet 1998;258:397-403.



603. (sdh cyt) Succinate dehydrogenase cytochrome b subunit signatures 
Succinate dehydrogenase (SDH) is a membrane-bound complex of two main components: a
membrane-extrinsic component composed of an FAD-binding flavoprotein and an iron-sulfur
protein, and a hydrophobic component composed of a cytochrome b and a membrane anchor
protein. The cytochrome b component is a mono-heme transmembrane protein [1,2,3]
belonging to a family that groups:
- Cytochrome b-556 from bacterial SDH (gene sdhC).
- Cytochrome b560 from the mammalian mitochondrial SDH complex.
- Cytochrome b560 subunit encoded in the mitochondrial genome of some algae and in the
  plant Marchantia polymorpha.
- Cytochrome b from yeast mitochondrial SDH complex (gene SDH3 or CYB3).
- Protein cyt-1 from Caenorhabditis.
These cytochromes are proteins of about 130 residues that comprise three transmembrane
regions. There are two conserved histidines which may be involved in binding the heme
group. Two signature patterns have been developed that include these histidine residues.

Consensus pattern: R-P-[LIVMT]-x(3)-[LIVM]-x(6)-[LIVMWPK]-x(4)-S-x(2)-H-R-x-[ST]
[H could be a heme ligand]

Consensus pattern: H-x(3)-[GA]-[LIVMT]-R-[HF]-[LIVMF]-x-[FYWM]-D-x-[GVA]
[H could be a heme ligand]

[ 1] Yu L., Wei Y.-Y., Usui S., Yu C.-A. J. Biol. Chem. 267:24508-24515(1992). [ 2]
Abraham P.R., Mulder A., Van't Riet J., Raue H.A. Mol. Gen. Genet. 242:708-716(1994). [ 3]
Leblanc C., Boyen C., Richard O., Bonnard G., Grienenberger J.M., Kloareg B. J. Mol. Biol.
250:484-495(1995).

[1] The Sec1 family: a novel family of proteins involved in synaptic transmission and
general secretion. Halachmi N, Lev Z; J Neurochem 1996;66:889-897.
Number of members: 40 

605. Protein secE/sec61-gamma signature

In bacteria, the secE protein plays a role in protein export; it is one of the components,
with secY and secA, of the preprotein translocase. In eukaryotes, the evolutionarily related
protein sec61-gamma plays a role in protein translocation through the endoplasmic reticulum;
it is part of a trimeric complex that also consists of sec61-alpha and beta [1]. Both secE and
sec61-gamma are small proteins of about 60 to 90 amino acids that contain a single
transmembrane region at their C-terminal extremity (Escherichia coli secE is an exception, in
that it possesses an extra N-terminal segment of 60 residues that contains two additional
transmembrane domains). The sequence of secE/sec61-gamma is not extremely well conserved;
however, it is possible to derive a signature pattern centered on a conserved proline located
10 residues before the beginning of the transmembrane domain.

Consensus pattern: [LIVMFY]-x(2)-[DENQGA]-x(4)-[LIVMFTA]-x-[KRV]-x(2)-[KW]-P- 
x(3)-[SEQ]-x(7)-[LIVT]-[LIVGA]-[LIVFGAST] 

[ 1] Hartmann E., Sommer T., Prehn S., Goerlich D., Jentsch S., Rapoport T.A. Nature 
367:654-657(1994). 

606. 11-S plant seed storage proteins signature 

Plant seed storage proteins, whose principal function appears to be to serve as the major
nitrogen source for the developing plant, can be classified, on the basis of their structure,
into different families. 11-S are non-glycosylated proteins which form hexameric structures
[1,2]. Each of the subunits in the hexamer is itself composed of an acidic and a basic chain
derived from a single precursor and linked by a disulfide bond. This structure is shown in
the following representation:

               +-------------------------+
               |                         |
    xxxxxxxxxxxCxxxxxxxxxxxxxxxxxxxxxxNGxCxxxxxxxxxxxxxxxxxxxxxxx
                                      *********
    <------Acidic-subunit-------------><-----Basic-subunit------>
    <-----------------About-480-to-500-residues----------------->

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

Proteins that belong to the 11-S family are: pea and broad bean legumins, rape
cruciferin, rice glutelins, cotton beta-globulins, soybean glycinins, pumpkin 11-S globulin,
oat globulin, sunflower helianthinin G3, etc. The region that includes the conserved cleavage
site between the acidic and basic subunits (Asn-Gly) and a proximal cysteine residue which is
involved in the interchain disulfide bond has been used as a signature pattern for this family
of proteins.

Consensus pattern: N-G-x-[DE](2)-x-[LIVMF]-C-[ST]-x(11,12)-[PAG]-D
[C is involved in a disulfide bond]
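Because the signature begins at the Asn-Gly cleavage site, a match also indicates where a precursor would be divided into its acidic and basic chains. The following Python sketch illustrates this under simplifying assumptions: the regular expression is a hand translation of the consensus pattern above, the name `split_at_asn_gly` and the toy precursor are hypothetical, and other processing of a real precursor (for example signal peptide removal) is ignored.

```python
import re

# Hand translation of N-G-x-[DE](2)-x-[LIVMF]-C-[ST]-x(11,12)-[PAG]-D.
SIGNATURE_11S = re.compile(r"NG.[DE]{2}.[LIVMF]C[ST].{11,12}[PAG]D")


def split_at_asn_gly(precursor):
    """Split a precursor between the Asn and Gly that open the signature.

    Returns (acidic_chain, basic_chain), or None when the signature is absent.
    """
    m = SIGNATURE_11S.search(precursor)
    if m is None:
        return None
    cut = m.start() + 1  # the acidic chain ends with Asn, the basic chain starts with Gly
    return precursor[:cut], precursor[cut:]


# Made-up precursor containing the signature once.
toy_precursor = "AAAACAAAN" + "GAEEALCS" + "A" * 11 + "PD" + "AAA"
acidic, basic = split_at_asn_gly(toy_precursor)
print(len(acidic), len(basic))
```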

[ 1] Hayashi M., Mori H., Nishimura M., Akazawa T., Hara-Nishimura I. Eur. J. Biochem. 
172:627-632(1988). [ 2] Shotwell M.A., Afonso C., Davies E., Chesnut R.S., Larkins B.A. 
Plant Physiol. 87:698-704(1988). 

607. 7S seed storage protein 

7S globulin is one of the main storage proteins of most angiosperms and 
gymnosperms. The 7S storage proteins are homotrimers. 
Number of members: 67 

[1] The three-dimensional structure of canavalin from jack bean (Canavalia 
ensiformis). Ko TP, Ng JD, McPherson A; Plant Physiol 1993;101:729-744. 

608. Aspartate-semialdehyde dehydrogenase signature 

Aspartate-semialdehyde dehydrogenase (ASD) catalyzes the second step in the common
biosynthetic pathway leading from Asp to diaminopimelate and Lys, to Met, and to Thr: the
NADP-dependent reductive dephosphorylation of L-aspartyl phosphate to
L-aspartate-semialdehyde. In bacteria and fungi, ASD is a protein of about 40 Kd (340 to 370
residues) whose sequence is not extremely well conserved [1]. A conserved cysteine residue
has been implicated as important for the catalytic activity [2]. The region of conservation
around the active site residue is too small to be used as a signature pattern. Another, more
conserved region, located in the last third of the sequence, which contains both a conserved
cysteine and a histidine, has been used instead.

Consensus pattern: [LIVM]-[SADN]-x(2)-C-x-R-[LIVM]-x(4)-[GSC]-H-[STA]

[ 1] Baril C., Richaud C., Fourni E., Baranton G., Saint Girons I. J. Gen. Microbiol.
138:47-53(1992). [ 2] Karsten W.E., Viola R.E. Biochim. Biophys. Acta 1121:234-238(1992).



Reference No. 2750-942P 




N-acetyl-gamma-glutamyl-phosphate reductase active site 

N-acetyl-gamma-glutamyl-phosphate reductase (EC 1.2.1.38) (AGPR) [1,2] is the enzyme
that catalyzes the third step in the biosynthesis of arginine from glutamate, the
NADP-dependent reduction of N-acetyl-5-glutamyl phosphate into N-acetylglutamate
5-semialdehyde. In bacteria it is a monofunctional protein of 35 to 38 Kd (gene argC), while
in fungi it is part of a bifunctional mitochondrial enzyme (gene ARG5,6, arg11 or arg-6)
which contains an N-terminal acetylglutamate kinase (EC 2.7.2.8) domain and a C-terminal
AGPR domain. In the Escherichia coli enzyme, a cysteine has been shown to be implicated in
the catalytic activity; the region around this residue is well conserved and can be used as a
signature pattern.

Consensus pattern: [LIVM]-[GSA]-x-P-G-C-[FY]-[AVP]-T-[GA]-x(3)-[GTAC]-[LIVM]-x-P
[C is the active site residue]

[ 1] Ludovice M., Martin J.F., Carrachas P., Liras P. J. Bacteriol. 174:4606-4613(1992). [ 2]
Gessert S.F., Kim J.H., Nargang F.E., Weiss R.L. J. Biol. Chem. 269:8189-8203(1994).

609. Sialyltransferase family. 
Number of members: 18 


610. SpoU rRNA Methylase family 

This family of proteins probably uses S-AdoMet. Number of members: 58
[1] SpoU protein of Escherichia coli belongs to a new family of putative rRNA methylases.
Koonin EV, Rudd KE; Nucleic Acids Res 1993;21:5519-5519. [2] The spoU gene of
Escherichia coli, the fourth gene of the spoT operon, is essential for tRNA (Gm18)
2'-O-methyltransferase activity. Persson BC, Jager G, Gustafsson C; Nucleic Acids Res
1997;25:4093-4097.

611. Stathmin family signatures 

Stathmin [1] (from the Greek 'stathmos', which means relay) is a ubiquitous intracellular
protein, present in a variety of phosphorylated forms, which serves as a relay for diverse
second messenger pathways. Its expression and phosphorylation are regulated throughout
development and in response to extracellular signals regulating cell proliferation,
differentiation and function. Stathmin is a highly conserved protein of 149 amino acid
residues. Structurally, it consists of an N-terminal domain of about 45 residues followed by a
78 residue alpha-helical domain consisting of a heptad repeat coiled coil structure and a
C-terminal domain of 25 residues. Protein SCG10 is a neuron-specific, membrane-associated
protein that accumulates in the growth cones of developing neurons. It is highly similar in its
sequence to stathmin, but differs in that it contains an additional N-terminal hydrophobic
segment of 32 residues which is probably responsible for its interaction with membranes.
Xenopus protein XB3 is also evolutionarily related to stathmin and also contains an additional
N-terminal hydrophobic domain [2]. A conserved decapeptide which ends with the first three
residues of the coiled coil domain and a second pattern that corresponds to part of the central
region of the coiled coil have been selected as signatures for proteins of the stathmin family.

Consensus pattern: P-[KRQ]-[KR](2)-[DE]-x-S-L-[EG]-E

Consensus pattern: A-E-K-R-E-H-E-[KR]-E

[ 1] Sobel A. Trends Biochem. Sci. 16:301-305(1991). [ 2] Maucuer A., Moreau J., Mechali
M., Sobel A. J. Biol. Chem. 268:16420-16429(1993).

612. SUA5/yciO/yrdC family signature

The following uncharacterized proteins have been shown [1] to share regions of similarities:
- Yeast protein SUA5.
- Escherichia coli hypothetical protein yciO and HI1198, the corresponding Haemophilus
  influenzae protein.
- Escherichia coli hypothetical protein yrdC and HI0656, the corresponding Haemophilus
  influenzae protein.
- Bacillus subtilis hypothetical protein ywlC.
- Mycobacterium leprae hypothetical protein in rfe-hemK intergenic region.
- Methanococcus jannaschii hypothetical protein MJ0062.
These are proteins of 20 to 46 Kd which contain a number of conserved regions in their
N-terminal section. They can be picked up in the database by the following pattern.

Consensus pattern: [LIVMTA](3)-[LIVMFYC]-[PG]-T-[DE]-[STA]-x-[FY]-[GA]-[LIVM]-[GS]

[ 1] Bairoch A., Rudd K.E., Robison K. Unpublished observations (1995). 
613. Sucrose synthase

Sucrose synthases catalyse the synthesis of sucrose from UDP-glucose and fructose. This
family includes the bulk of the sucrose synthase protein. However, the carboxyl-terminal
region of the sucrose synthases belongs to the glycosyl transferase family Glycos_transf_1.

614. Sulfotransferase proteins
Number of members: 59

615. Synaptophysin / synaptoporin signature 

Synaptophysin and synaptoporin [1] are structurally related proteins, found in the membrane
of synaptic vesicles, which may function as ionic or solute channels. These two glycoproteins
seem to span the membrane four times. Both their N- and C-terminal sequences seem to be
cytoplasmically located. As a signature pattern for this family of proteins, a highly conserved
region located at the beginning of the first intravesicular loop, just after the first
transmembrane domain, has been selected. This region contains a cysteine residue that may be
involved in a disulfide bond.

Consensus pattern: L-S-V-[DE]-C-x-N-K-T [C may be involved in a disulfide bond]

[ 1] Knaus P., Marqueze-Pouey B., Scherer H., Betz H. Neuron 5:453-462(1990).

616. Syndecans signature

Syndecans [1,2] (from the Greek syndein, to bind together) are a family of transmembrane
heparan sulfate proteoglycans which are implicated in the binding of extracellular matrix
components and growth factors. Syndecans bind a variety of molecules via their heparan
sulfate chains and can act as receptors or as co-receptors. Structurally, these proteins consist
of four separate domains:
a) A signal sequence;
b) An extracellular domain (ectodomain) of variable length whose sequence is not
   evolutionarily conserved in the various forms of syndecans. The ectodomain contains the
   sites of attachment of the heparan sulfate glycosaminoglycan side chains;
c) A transmembrane region;
d) A highly conserved cytoplasmic domain of about 30 to 35 residues which could interact
   with cytoskeletal proteins.
The proteins known to belong to this family are:
- Syndecan 1.
- Syndecan 2 or fibroglycan.
- Syndecan 3 or neuroglycan or N-syndecan.
- Syndecan 4 or amphiglycan or ryudocan.
- Drosophila syndecan.
- Caenorhabditis elegans probable syndecan (F57C7.3).
The signature pattern that has been developed for syndecans starts with the last residue of
the transmembrane region and includes the first 10 residues of the cytoplasmic domain. This
region, which contains four basic residues, could act as a stop transfer site.

Consensus pattern: [FY]-R-[IM]-[KR]-K(2)-D-E-G-S-Y
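The description of this signature (one transmembrane residue plus ten cytoplasmic residues, four of them basic) can be checked directly against the consensus pattern. The short Python sketch below expands the pattern into one set of allowed residues per position and counts the positions restricted to Lys or Arg. The helper name `expand` is an assumption for illustration, and the helper only handles fixed-length patterns without `x`, such as this one.

```python
SYNDECAN = "[FY]-R-[IM]-[KR]-K(2)-D-E-G-S-Y"


def expand(pattern):
    """Return one set of allowed residues per position of a fixed-length pattern."""
    positions = []
    for element in pattern.split("-"):
        repeat = 1
        if element.endswith(")"):
            # Separate a repeat count such as K(2) into the residue and the count.
            element, count = element[:-1].split("(")
            repeat = int(count)
        positions.extend([set(element.strip("[]"))] * repeat)
    return positions


positions = expand(SYNDECAN)
# Positions (0-based) at which only Lys or Arg is allowed, i.e. always basic.
basic = [i for i, allowed in enumerate(positions) if allowed <= {"K", "R"}]
print(len(positions), basic)
```

The count of eleven positions agrees with one transmembrane residue followed by ten cytoplasmic residues, and four positions admit only K or R.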

[ 1] Bernfield M., Kokenyesi R., Kato M., Hinkes M.T., Spring J., Gallo R.L., Lose E.J.
Annu. Rev. Cell Biol. 8:365-393(1992). [ 2] David G. FASEB J. 7:1023-1030(1993).

617. Syntaxin / epimorphin family signature 

The following proteins have been shown to be evolutionarily related [1,2,3]:
- Epimorphin (or syntaxin 2), a mammalian mesenchymal protein which plays an essential role
  in epithelial morphogenesis.
- Syntaxin 1A (also known as antigen HPC-1) and syntaxin 1B, which are synaptic proteins
  which may be involved in docking of synaptic vesicles at presynaptic active zones.
- Syntaxin 3.
- Syntaxin 4, which is potentially involved in docking of synaptic vesicles at presynaptic
  active zones.
- Syntaxin 5, which mediates endoplasmic reticulum to Golgi transport.
- Syntaxin 6, which is involved in intracellular vesicle trafficking.
- Syntaxin 7.
- Yeast PEP12 (or VPS6), which is required for the transport of proteases to the vacuole.
- Yeast SED5, which is required for the fusion of transport vesicles with the Golgi complex.
- Yeast SSO1 and SSO2, which are required for vesicle fusion with the plasma membrane.
- Yeast VAM3, which is required for vacuolar assembly.
- Arabidopsis thaliana protein KNOLLE, which may be involved in cytokinesis.
- Caenorhabditis elegans hypothetical proteins F35C8.4, F48F7.2, F55A11.2 and T01B11.3.
The above proteins share the following characteristics: a size ranging from 30 Kd to 40 Kd; a
C-terminal extremity which is highly hydrophobic and is probably involved in anchoring the
protein to the membrane; and a central, well conserved region, which seems to be in a
coiled-coil conformation. The pattern specific for this family is based on the most conserved
region of the coiled coil domain.

Consensus pattern: [RQ]-x(3)-[LIVMA]-x(2)-[LIVM]-[ESH]-x(2)-[LIVMT]-x-[DEVM]-
[LIVM]-x(2)-[LIVM]-[FS]-x(2)-[LIVM]-x(3)-[LIVT]-x(2)-Q-[GADEQ]-x(2)-[LIVM]-
[DNQT]-x-[LIVMF]-[DESV]-x(2)-[LIVM]
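For a long signature such as this one, it can be useful to know how many residues a match will span, for example when choosing a scanning window. The Python sketch below adds up the minimum and maximum number of residues each pattern element can cover. The function name `pattern_span` is an assumption for illustration, and the pattern string is the consensus pattern above joined into a single line.

```python
import re

SYNTAXIN = (
    "[RQ]-x(3)-[LIVMA]-x(2)-[LIVM]-[ESH]-x(2)-[LIVMT]-x-[DEVM]-"
    "[LIVM]-x(2)-[LIVM]-[FS]-x(2)-[LIVM]-x(3)-[LIVT]-x(2)-Q-[GADEQ]-x(2)-[LIVM]-"
    "[DNQT]-x-[LIVMF]-[DESV]-x(2)-[LIVM]"
)


def pattern_span(pattern):
    """Return (min_len, max_len) of the segments a consensus pattern can match."""
    lo_total = hi_total = 0
    for element in pattern.replace(" ", "").strip("-").split("-"):
        # A trailing (n) or (n,m) gives the repeat count; otherwise the element covers 1 residue.
        m = re.search(r"\((\d+)(?:,(\d+))?\)$", element)
        lo = int(m.group(1)) if m else 1
        hi = int(m.group(2)) if m and m.group(2) else lo
        lo_total += lo
        hi_total += hi
    return lo_total, hi_total


print(pattern_span(SYNTAXIN))
```

Patterns with a variable gap, such as x(4,5), give different minimum and maximum values.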
[ 1] Bennett M.K., Garcia-Arraras J.E., Elferink L.A., Peterson K., Fleming A.M., Hazuka
C.D., Scheller R.H. Cell 74:863-873(1993). [ 2] Spring J., Kato M., Bernfield M. Trends
Biochem. Sci. 18:124-125(1993). [ 3] Pelham H.R.B. Cell 73:425-426(1993).

618. Sm protein 

The U1, U2, U4/U6, and U5 small nuclear ribonucleoprotein
particles (snRNPs) involved in pre-mRNA splicing contain seven
Sm proteins (B/B', D1, D2, D3, E, F and G) in common, which
assemble around the Sm site present in four of the major
spliceosomal small nuclear RNAs. These proteins contain a
common sequence motif in two segments, Sm1 and Sm2, separated
by a short variable linker.

[1] Hermann H, Fabrizio P, Raker VA, Foulaki K, Hornig H, Brahms H, Luhrmann R; EMBO
J 1995;14:2076-2088. [2] Kambach C, Walke S, Young R, Avis JM, de la Fortelle E, Raker
VA, Luhrmann R, Li J, Nagai K; Cell 1999;96:375-387.

619. Skp1 family

[1] Stebbins CE, Kaelin WG Jr, Pavletich NP; Science 1999;284:455-461.

620. Protein secY signatures

The eubacterial secY protein [1] plays an important role in protein export. It interacts with the
signal sequences of secretory proteins as well as with two other components of the protein
translocation system: secA and secE. SecY is an integral plasma membrane protein of 419 to
492 amino acid residues that apparently contains ten transmembrane segments. Such a
structure probably confers on secY a 'translocator' function, providing a channel for
periplasmic and outer-membrane precursor proteins. Homologs of secY are found in
archaebacteria [2]. SecY is also encoded in the chloroplast genome of some algae [3], where it
could be involved in a prokaryotic-like protein export system across the two membranes of
the chloroplast endoplasmic reticulum (CER), which is present in chromophyte and
cryptophyte algae. Two signature patterns have been developed for secY proteins. The
first corresponds to the second transmembrane region, which is the most conserved section of
these proteins. The second spans the C-terminal part of the fourth transmembrane region, a
short intracellular loop, and the N-terminal part of the fifth transmembrane region.

Consensus pattern: [GST]-[LIVMF](2)-x-[LIVM]-G-[LIVM]-x-P-[LIVMFY](2)-x-[AS]-
[GSTQ]-[LIVMFAT](3)-Q-[LIVMFA](2)

Consensus pattern: [LIVMFYW](2)-x-[DE]-x-[LIVMF]-[STN]-x(2)-G-[LIVMF]-[GST]-
[NST]-G-x-[GST]-[LIVMF](3)

[ 1] Ito K. Mol. Microbiol. 6:2423-2428(1992). [ 2] Auer J., Spicker G., Boeck A. Biochimie
73:683-688(1991). [ 3] Douglas S.E. FEBS Lett. 298:93-96(1992).

621. (Seed protein) Small hydrophilic plant seed proteins signature

The following small hydrophilic plant seed proteins are structurally related:
- Arabidopsis thaliana proteins GEA1 and GEA6.
- Cotton late embryogenesis abundant (LEA) protein D-19.
- Carrot EMB-1 protein.
- Barley LEA proteins B19.1A, B19.1B, B19.3 and B19.4.
- Maize late embryogenesis abundant protein Emb564.
- Radish late seed maturation protein p8B6.
- Rice embryonic abundant protein Emp1.
- Sunflower 10 Kd late embryogenesis abundant protein (DS10).
- Wheat Em proteins.
These proteins contain from 83 to 153 amino acid residues and may play a role [1,2] in
equipping the seed for survival, maintaining a minimal level of hydration in the dry organism
and preventing the denaturation of cytoplasmic components. They may also play a role during
imbibition by controlling water uptake. As a signature pattern, the best conserved region in
the sequence of these proteins has been selected; it is a glycine-rich nonapeptide located in
the N-terminal section.

Consensus pattern: G-[EQ]-T-V-V-P-G-G-T

[ 1] Dure L. III, Crouch M., Harada J., Ho T.-H. D., Mundy J., Quatrano R., Thomas T., Sung
Z.R. Plant Mol. Biol. 12:475-486(1989). [ 2] Gaubier P., Raynal M., Hull G., Huestis G.M.,
Grellet F., Arenas C., Pages M., Delseny M. Mol. Gen. Genet. 238:409-418(1993).
622. Serine carboxypeptidases, active sites

All known carboxypeptidases are either metallo carboxypeptidases or serine
carboxypeptidases. The catalytic activity of the serine carboxypeptidases, like that of
the trypsin family serine proteases, is provided by a charge relay system involving an aspartic
acid residue hydrogen-bonded to a histidine, which is itself hydrogen-bonded to a serine [1].
Proteins known to be serine carboxypeptidases are:
- Barley and wheat serine carboxypeptidases I, II, and III [2].
- Yeast carboxypeptidase Y (YSCY) (gene PRC1), a vacuolar protease involved in degrading
  small peptides.
- Yeast KEX1 protease, involved in killer toxin and alpha-factor precursor processing.
- Fission yeast sxa2, a probable carboxypeptidase involved in degrading or processing mating
  pheromones [3].
- Penicillium janthinellum carboxypeptidase S1 [4].
- Aspergillus niger carboxypeptidase pepF.
- Aspergillus saitoi carboxypeptidase cpdS.
- Vertebrate protective protein / cathepsin A [5], a lysosomal protein which is not only a
  carboxypeptidase but also essential for the activity of both beta-galactosidase and
  neuraminidase.
- Mosquito vitellogenic carboxypeptidase (VCP) [6].
- Naegleria fowleri virulence-related protein Nf314 [7].
- Yeast hypothetical protein YBR139W.
- Caenorhabditis elegans hypothetical proteins C08H9.1, F13D12.6, F32A5.3, F41C3.5 and
  K10B2.2.
This family also includes:
- Sorghum (S)-hydroxymandelonitrile lyase (hydroxynitrile lyase) (HNL) [8], an enzyme
  involved in plant cyanogenesis.
The sequences surrounding the active site serine and histidine residues are highly conserved
in all these serine carboxypeptidases.

Consensus pattern: [LIVM]-x-[GTA]-E-S-Y-[AG]-[GS] [S is the active site residue]
Consensus pattern: [LIVF]-x(2)-[LIVSTA]-x-[IVPST]-x-[GSDNQL]-[SAGV]-[SG]-H-x-
[IVAQ]-P-x(3)-[PSA] [H is the active site residue]

[ 1] Liao D.-I., Remington S.J. J. Biol. Chem. 265:6528-6531(1990). [ 2] Sorensen S.B.,
Svendsen I., Breddam K. Carlsberg Res. Commun. 54:193-202(1989). [ 3] Imai Y.,
Yamamoto M. Mol. Cell. Biol. 12:1827-1834(1992). [ 4] Svendsen I., Hofmann T., Endrizzi
J., Remington J., Breddam K. FEBS Lett. 333:39-43(1993). [ 5] Galjart N.J., Morreau H.,
Willemsen R., Gillemans N., Bonten E.J., d'Azzo A. J. Biol. Chem. 266:14754-14762(1991).
[ 6] Cho W.L., Deitsch K.W., Raikhel A.S. Proc. Natl. Acad. Sci. U.S.A.
88:10821-10824(1991). [ 7] Hu W.N., Kopachik W., Band R.N. Infect. Immun.
60:2418-2424(1992). [ 8] Wajant H., Mundry K.W., Pfitzenmaier K. Plant Mol. Biol.
26:735-746(1994). [ 9] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994).
623. Serpins signature

Serpins (SERine Proteinase INhibitors) [1,2,3,4] are a group of structurally related
proteins. They are high molecular weight (400 to 500 amino acids), extracellular,
irreversible serine protease inhibitors with a well defined structural-functional
characteristic: a reactive region that acts as a 'bait' for an appropriate serine
protease. This region is found in the C-terminal part of these proteins. Proteins which are
known to belong to the serpin family are listed below (references are only provided for
recently determined sequences):
- Alpha-1 protease inhibitor (alpha-1-antitrypsin, contrapsin).
- Alpha-1-antichymotrypsin.
- Antithrombin III.
- Alpha-2-antiplasmin.
- Heparin cofactor II.
- Complement C1 inhibitor.
- Plasminogen activator inhibitors 1 (PAI-1) and 2 (PAI-2).
- Glia derived nexin (GDN) (Protease nexin I).
- Protein C inhibitor.
- Rat hepatocytes SPI-1, SPI-2 and SPI-3 inhibitors.
- Human squamous cell carcinoma antigen (SCCA), which may act in the modulation of the host
  immune response against tumor cells.
- A lepidopteran protease inhibitor.
- Leukocyte elastase inhibitor which, in contrast to other serpins, is an intracellular
  protein.
- Neuroserpin [5], a neuronal inhibitor of plasminogen activators and plasmin.
- Cowpox virus crmA [6], an inhibitor of the thiol protease interleukin-1B converting enzyme
  (ICE). CrmA is the only serpin known to inhibit a non-serine proteinase.
- Some orthopoxvirus probable protease inhibitors, which may be involved in the regulation of
  the blood clotting cascade and/or of the complement cascade in the mammalian host.
On the basis of strong sequence similarities, a number of proteins with no known inhibitory
activity are said to belong to this family:
- Bird ovalbumin and the related genes X and Y proteins.
- Angiotensinogen, the precursor of the angiotensin active peptide.
- Barley protein Z, the major endosperm albumin.
- Corticosteroid binding globulin (CBG).
- Thyroxine-binding globulin (TBG).
- Sheep uterine milk protein (UTMP) and pig uteroferrin-associated protein (UFAP).
- Hsp47, an endoplasmic reticulum heat-shock protein that binds strongly to collagen and
  could act as a chaperone in the collagen biosynthetic pathway [7].
- Maspin, which seems to function as a tumor suppressor [5].
- Pigment epithelium-derived factor precursor (PEDF), a protein with a strong neurotrophic
  activity [8].
- Ep45, an estrogen-regulated protein from Xenopus [9].
A signature pattern has been developed for this family of proteins, centered on a well
conserved Pro-Phe sequence which is found ten to fifteen residues on the C-terminal side of
the reactive bond.

Consensus pattern: [LIVMFY]-x-[LIVMFYAC]-[DNQ]-[RKHQS]-[PST]-F-[LIVMFY]-
[LIVMFYC]-x-[LIVMFAH]

[ 1] Carrell R., Travis J. Trends Biochem. Sci. 10:20-24(1985). [ 2] Carrell R., Pemberton
P.A., Boswell D.R. Cold Spring Harbor Symp. Quant. Biol. 52:527-535(1987). [ 3] Huber R.,
Carrell R.W. Biochemistry 28:8951-8966(1989). [ 4] Remold-O'Donnell E. FEBS Lett.
315:105-108(1993). [ 5] Osterwalder T., Contartese J., Stoeckli E.T., Kuhn T.B., Sonderegger
P. EMBO J. 15:2944-2953(1996). [ 6] Komiyama T., Ray C.A., Pickup D.J., Howard A.D.,
Thornberry N.A., Peterson E.P., Salvesen G. J. Biol. Chem. 269:19331-19337(1994). [ 7]
Clarke E., Sandwal B.D. Biochim. Biophys. Acta 1129:246-248(1992). [ 8] Zou Z.,
Anisowicz A., Neveu M., Rafidi K., Sheng S., Sager R., Hendrix M.J., Seftor E., Thor A.
Science 263:526-529(1994). [ 9] Steele F.R., Chader G.J., Johnson L.V., Tombran-Tink J.
Proc. Natl. Acad. Sci. U.S.A. 90:1526-1530(1993). [10] Holland L.J., Suksang C., Wall A.A.,
Roberts L.R., Moser D.R., Bhattacharya A. J. Biol. Chem. 267:7053-7059(1992).



624. Sigma-54 interaction domain signatures and profile 

Some bacterial regulatory proteins activate the expression of genes from promoters
recognized by core RNA polymerase associated with the alternative sigma-54 factor. These
proteins have a conserved domain of about 230 residues involved in the ATP-dependent [1,2]
interaction with sigma-54. This domain has been found in the proteins listed below:
- acoR from Alcaligenes eutrophus, an activator of the acetoin catabolism operon acoXABC.
- algB from Pseudomonas aeruginosa, an activator of alginate biosynthetic gene algD.
- dctD from Rhizobium, an activator of dctA, the C4-dicarboxylate transport protein.
- dhaR from Citrobacter freundii, a regulator of the dha operon for glycerol utilization.
- fhlA from Escherichia coli, an activator of the formate dehydrogenase H and hydrogenase III
  structural genes.
- flbD from Caulobacter crescentus, an activator of flagellar genes.
- hoxA from Alcaligenes eutrophus, an activator of the hydrogenase operon.
- hrpS from Pseudomonas syringae, an activator of hprD as well as other hrp loci involved in
  plant pathogenicity.
- hupR1 from Rhodobacter capsulatus, an activator of the [NiFe] hydrogenase genes hupSL.
- hydG from Escherichia coli and Salmonella typhimurium, an activator of the hydrogenase
  activity.
- levR from Bacillus subtilis, which regulates the expression of the levanase operon
  (levDEFG and sacC).
- nifA (as well as anfA and vnfA) from various bacteria, an activator of the nif
  nitrogen-fixing operon.
- ntrC, from various bacteria, an activator of nitrogen assimilatory genes such as that for
  glutamine synthetase (glnA) or of the nif operon.
- pgtA from Salmonella typhimurium, the activator of the inducible phosphoglycerate transport
  system.
- pilR from Pseudomonas aeruginosa, an activator of pilin gene transcription.
- rocR from Bacillus subtilis, an activator of genes for arginine utilization.
- tyrR from Escherichia coli, involved in the transcriptional regulation of aromatic
  amino-acid biosynthesis and transport.
- wtsA, from Erwinia stewartii, an activator of plant pathogenicity gene wtsB.
- xylR from Pseudomonas putida, the activator of the tol plasmid xylene catabolism operon
  xylCAB and of xylS.
- Escherichia coli hypothetical protein yfhA.
- Escherichia coli hypothetical protein yhgB.
About half of these proteins (algB, dctD, flbD, hoxA, hupR1, hydG, ntrC, pgtA and pilR)
belong to signal transduction two-component systems [3] and possess, in their N-terminal
section, a domain that can be phosphorylated by a sensor-kinase protein. Almost all of these
proteins possess a helix-turn-helix DNA-binding domain in their C-terminal section. The
domain which interacts with the sigma-54 factor has an ATPase activity. This may be required
to promote a conformational change necessary for the interaction [4]. The domain contains an
atypical ATP-binding motif A (P-loop) as well as a form of motif B. The two ATP-binding
motifs are located in the N-terminal section of the domain; signature patterns have been
developed for both motifs. Other regions of the domain are also conserved. One of them,
located in the C-terminal section, has been selected as a third signature pattern.

Consensus pattern: [LIVMFY](3)-x-G-[DEQ]-[STE]-G-[STAV]-G-K-x(2)-[LIVMFY] 
Consensus pattern: [GS]-x-[LIVMF]-x(2)-A-[DNEQASH]-[GNEK]-G-[STIM]- 
[LIVMFY](3)-[DE]-[EK]-[LIVM] 

Consensus pattern: [FYW]-P-[GS]-N-[LIVM]-R-[EQ]-L-x-[NHAT] 
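
The consensus patterns in this specification use a compact notation: a bracketed set lists the residues allowed at one position, x stands for any residue, and a number in parentheses is a repeat count or range. As a minimal sketch, assuming only the notation features that appear in these patterns, such a pattern can be translated into a regular expression and used to scan a sequence. The function name prosite_to_regex and the test peptide are hypothetical and are not part of the disclosure.

```python
import re

# One pattern element: a residue set, an excluded set, a single residue or x,
# optionally followed by a repeat such as (3) or (2,5).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a dash-separated consensus pattern into a Python regex string."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unrecognized pattern element: {element!r}")
        body, lo, hi = m.groups()
        if body == "x":
            body = "."                        # any residue
        elif body.startswith("{"):
            body = "[^" + body[1:-1] + "]"    # excluded residues
        if lo is not None:
            body += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(body)
    return "".join(parts)


# Third sigma-54 interaction domain signature from the text above.
sigma54_c = prosite_to_regex("[FYW]-P-[GS]-N-[LIVM]-R-[EQ]-L-x-[NHAT]")

# Hypothetical peptide used only to exercise the regex; it is not a real protein.
hit = re.search(sigma54_c, "AAFPGNLRELRNAA")
print(sigma54_c, hit.group() if hit else None)
```

A match only indicates that the sequence is compatible with the signature; it does not by itself establish that the protein belongs to the family.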
[ 1] Morett E., Segovia L. J. Bacteriol. 175:6067-6074(1993). [ 2] Austin S., Kundrot C.,
Dixon R. Nucleic Acids Res. 19:2281-2287(1991). [ 3] Albright L.M., Huala E., Ausubel
F.M. Annu. Rev. Genet. 23:311-336(1989). [ 4] Austin S., Dixon R. EMBO J.
11:2219-2228(1992).


625. Sigma-70 factors family signatures 

Sigma factors [1] are bacterial transcription initiation factors that promote the attachment of
the core RNA polymerase to specific initiation sites and are then released. They alter the
specificity of promoter recognition. Most bacteria express a multiplicity of sigma factors.
Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma
factor, and sigma-54 (gene rpoN or ntrA), direct the transcription of a wide variety of genes.
The other sigma factors, known as alternative sigma factors, are required for the transcription
of specific subsets of genes. With regard to sequence similarity, sigma factors can be grouped
into two classes: the sigma-54 and sigma-70 families. The sigma-70 family includes, in
addition to the primary sigma factor, a wide variety of sigma factors, some of which are listed
below:
- Bacillus sigma factors involved in the control of sporulation-specific genes: sigma-E
  (sigE or spoIIGB), sigma-F (sigF or spoIIAC), sigma-G (sigG or spoIIIG), sigma-H (sigH or
  spoOC) and sigma-K (sigK or spoIVCB/spoIIIC).
- Escherichia coli and related bacteria sigma-32 (gene rpoH or htpR), involved in the
  expression of heat shock genes.
- Escherichia coli and related bacteria sigma-27 (gene fliA), involved in the expression of
  the flagellin gene.
- Escherichia coli sigma-S (gene rpoS or katF), which seems to be involved in the expression
  of genes required for protection against external stresses.
- Myxococcus xanthus sigma-B (sigB), which is essential for the late-stage differentiation of
  that bacterium.
Alignments of the sigma-70 family permit the identification of four regions of high
conservation [2,3]. Each of these four regions can in turn be subdivided into a number of
sub-regions. Signature patterns based on the two best-conserved sub-regions have been
developed. The first pattern corresponds to sub-region 2.2; the exact function of this
sub-region is not known, although it could be involved in the binding of the sigma factor to
the core RNA polymerase. The second pattern corresponds to sub-region 4.2, which seems to
harbor a DNA-binding 'helix-turn-helix' (HTH) motif involved in binding the conserved -35
region of promoters recognized by the major sigma factors. The second pattern starts one
residue before the N-terminal extremity of the HTH region and ends six residues after its
C-terminal extremity.

Consensus pattern: [DE]-[LIVMF](2)-[HEQS]-x-G-x-[LIVMFA]-G-L-[LIVMFYE]-x-
[GSAM]-[LIVMAP] 

Consensus pattern: [STN]-x(2)-[DEQ]-[LIVM]-[GAS]-x(4)-[LIVMF]-[PSTG]-x(3)- 
[LIVMA]-x-[NQR]-[LIVMA]-[EQH]-x(3)-[LIVMFW]-x(2)-[LIVM] 

[ 1] Helmann J.D., Chamberlin M.J. Annu. Rev. Biochem. 57:839-872(1988). [ 2] Gribskov
M., Burgess R.R. Nucleic Acids Res. 14:6745-6763(1986). [ 3] Lonetto M.A., Gribskov M.,
Gross C.A. J. Bacteriol. 174:3843-3849(1992). [ 4] Lonetto M.A., Brown K.L., Rudd K.E.,
Buttner M.J. Proc. Natl. Acad. Sci. U.S.A. 91:7573-7577(1994).

626. Signal carboxyl-terminal domain. 430 members.

627. Signal peptidases I signatures

Signal peptidases (SPases) [1] (also known as leader peptidases) remove the signal peptides
from secretory proteins. In prokaryotes three types of SPases are known: type I (gene lepB),
which is responsible for the processing of the majority of exported pre-proteins; type II (gene
lsp), which only processes lipoproteins; and a third type involved in the processing of pili
subunits. SPase I is an integral membrane protein that is anchored in the cytoplasmic
membrane by one (in B. subtilis) or two (in E. coli) N-terminal transmembrane domains, with
the main part of the protein protruding into the periplasmic space. Two residues have been
shown [2,3] to be essential for the catalytic activity of SPase I: a serine and a lysine. SPase I
is evolutionarily related to the yeast mitochondrial inner membrane protease subunits 1 and 2
(genes IMP1 and IMP2), which catalyze the removal of signal peptides required for the
targeting of proteins from the mitochondrial matrix, across the inner membrane, into the
inter-membrane space [4]. In eukaryotes the removal of signal peptides is effected by an
oligomeric enzymatic complex composed of at least five subunits: the signal peptidase
complex (SPC). The SPC is located in the endoplasmic reticulum membrane. Two
components of mammalian SPC, the 18 Kd (SPC18) and the 21 Kd (SPC21) subunits, as well
as the yeast SEC11 subunit, have been shown [5] to share regions of sequence similarity with
prokaryotic SPases I and yeast IMP1/IMP2. Three signature patterns for these proteins have
been developed. The first signature contains the putative active site serine; the second
signature contains the putative active site lysine, which is not conserved in the SPC subunits;
and the third signature corresponds to a conserved region of unknown biological significance
which is located in the C-terminal section of all these proteins.
Consensus pattern: [GS]-x-S-M-x-[PS]-[AT]-[LF] [S is an active site residue] 
Consensus pattern: K-R-[LIVMSTA](2)-G-x-[PG]-G-[DE]-x-[LIVM]-x-[LIVMFY] [K is an 
active site residue] 

Consensus pattern: [LIVMFYW](2)-x(2)-G-D-[NH]-x(3)-[SND]-x(2)-[SG]
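
Several patterns are annotated with the residue that plays a functional role, such as the active site serine in the first signal peptidase I signature. A minimal sketch of how that annotation could be carried through a scan is shown below: the regular expression is a hand translation of [GS]-x-S-M-x-[PS]-[AT]-[LF], with a named group on the annotated serine. The function name and the test peptide are hypothetical and serve only to illustrate the idea.

```python
import re

# Hand-written regex form of [GS]-x-S-M-x-[PS]-[AT]-[LF]; the named group
# marks the serine that the text annotates as the active site residue.
SPASE_SERINE = re.compile(r"[GS].(?P<active>S)M.[PS][AT][LF]")


def active_site_positions(sequence: str) -> list[int]:
    """Return the 1-based position of the annotated serine for each match."""
    return [m.start("active") + 1 for m in SPASE_SERINE.finditer(sequence)]


# Hypothetical peptide, not a real signal peptidase sequence.
peptide = "MKKGDSMLPTLAA"
print(active_site_positions(peptide))
```

Note that finditer reports non-overlapping matches only, which is sufficient for a fixed-length signature of this kind.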

[ 1] Dalbey R.E., von Heijne G. Trends Biochem. Sci. 17:474-478(1992). [ 2] Sung M.,
Dalbey R.E. J. Biol. Chem. 267:13154-13159(1992). [ 3] Black M.T. J. Bacteriol.
175:4957-4961(1993). [ 4] Nunnari J., Fox T.D., Walter P. Science 262:1997-2004(1993).
[ 5] van Dijl J.M., de Jong A., Vehmaanpera J., Venema G., Bron S. EMBO J.
11:2819-2828(1992). [ 6] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994).



628. (sodcu) Copper/Zinc superoxide dismutase signatures

Copper/Zinc superoxide dismutase (SODC) [1] is one of the three forms of an enzyme that
catalyzes the dismutation of superoxide radicals. SODC binds one atom each of zinc and
copper. Various forms of SODC are known: a cytoplasmic form in eukaryotes, an additional
chloroplast form in plants, an extracellular form in some eukaryotes, and a periplasmic form
in prokaryotes. The metal binding sites are conserved in all the known SODC sequences [2].
Two signature patterns have been derived for this family of enzymes: the first one contains
two histidine residues that bind the copper atom; the second one is located in the C-terminal
section of SODC and contains a cysteine which is involved in a disulfide bond.

Consensus pattern: [GA]-[IMFAT]-H-[LIVF]-H-x(2)-[GP]-[SDG]-x-[STAGDE] [The two
H's are copper ligands]

Consensus pattern: G-[GN]-[SGA]-G-x-R-x-[SGA]-C-x(2)-[IV] [C is involved in a disulfide
bond]

[ 1] Bannister J.V., Bannister W.H., Rotilio G. CRC Crit. Rev. Biochem. 22:111-154(1987).[ 
2] Smith M.W., Doolittle R.F. J. Mol. Evol. 34:175-184(1992). 



629. (sodfe) Manganese and iron superoxide dismutases signature 

Manganese superoxide dismutase (SODM) [1] is one of the three forms of an enzyme that
catalyzes the dismutation of superoxide radicals. The four ligands of the manganese atom are
conserved in all the known SODM sequences. These metal ligands are also conserved in the
related iron form of superoxide dismutases [2,3]. A short conserved region which includes
two of the four ligands, an aspartate and a histidine, has been selected as a signature.

Consensus pattern: D-x-W-E-H-[STA]-[FY](2) [D and H are manganese/iron ligands]

[ 1] Bannister J.V., Bannister W.H., Rotilio G. CRC Crit. Rev. Biochem. 22:111-154(1987).
[ 2] Parker M.W., Blake C.C.F. FEBS Lett. 229:377-382(1988). [ 3] Smith M.W., Doolittle
R.F. J. Mol. Evol. 34:175-184(1992).




630. Spectrin repeat 

Spectrin repeats are found in several proteins involved in cytoskeletal structure. These
include spectrin, alpha-actinin and dystrophin. The sequence repeat used in this family is
taken from the structural repeat in reference [2]. The spectrin repeat forms a three-helix
bundle. The second helix is interrupted by proline in some sequences.
Number of members: 898

[1] Actin-binding proteins. 1: Spectrin super family. Hartwig JH; Protein Profile
1995;2:732-732. [2] Crystal structure of the repetitive segments of spectrin. Yan Y,
Winograd E, Viel A, Cronin T, Harrison SC, Branton D; Science 1993;262:2027-2030.

631. (subtilase) Streptomyces subtilisin-type inhibitors signature

Bacteria of the Streptomyces family produce a family of proteinase inhibitors [1]
characterized by their strong activity toward subtilisin. They are collectively known as SSI's:
Streptomyces Subtilisin Inhibitors. Some SSI's also inhibit trypsin or chymotrypsin. In their
mature secreted form, SSI's are proteins of about 110 residues with two conserved disulfide
bonds.

xxxxxxxxxxxxxxCxxxxxxxCxxxxxxxxxCx#xxxxxxxxxxxxCxxxxxx ************

'C': conserved cysteine involved in a disulfide bond. '#': active site residue. '*': position of
the pattern.

Consensus pattern: C-x-P-x(2,3)-G-x-H-P-x(4)-A-C-[ATD]-x-L [The two C's are involved in
a disulfide bond]

[ 1] Taguchi S., Kojima S., Terabe M., Miura K.-I., Momose H. Eur. J. Biochem.
220:911-918(1994).

632. Sugar transport proteins signatures 

In mammalian cells the uptake of glucose is mediated by a family of closely related transport
proteins which are called the glucose transporters [1,2,3]. At least seven of these transporters
are currently known to exist (in human they are encoded by the GLUT1 to GLUT7 genes).
These integral membrane proteins are predicted to comprise twelve membrane-spanning
domains. The glucose transporters show sequence similarities [4,5] with a number of other
sugar or metabolite transport proteins listed below (references are only provided for recently
determined sequences):
- Escherichia coli arabinose-proton symport (araE).
- Escherichia coli galactose-proton symport (galP).
- Escherichia coli and Klebsiella pneumoniae citrate-proton symport (also known as citrate
  utilization determinant) (gene cit).
- Escherichia coli alpha-ketoglutarate permease (gene kgtP).
- Escherichia coli proline/betaine transporter (gene proP) [6].
- Escherichia coli xylose-proton symport (xylE).
- Zymomonas mobilis glucose facilitated diffusion protein (gene glf).
- Yeast high and low affinity glucose transport proteins (genes SNF3, HXT1 to HXT14).
- Yeast galactose transporter (gene GAL2).
- Yeast maltose permeases (genes MAL3T and MAL6T).
- Yeast myo-inositol transporters (genes ITR1 and ITR2).
- Yeast carboxylic acid transporter protein homolog JEN1.
- Yeast inorganic phosphate transporter (gene PHO84).
- Kluyveromyces lactis lactose permease (gene LAC12).
- Neurospora crassa quinate transporter (gene Qa-y), and Emericella nidulans quinate
  permease (gene qutD).
- Chlorella hexose carrier (gene HUP1).
- Arabidopsis thaliana glucose transporter (gene STP1).
- Spinach sucrose transporter.
- Leishmania donovani transporters D1 and D2.
- Leishmania enriettii probable transport protein (LTP).
- Yeast hypothetical proteins YBR241c, YCR98c and YFL040w.
- Caenorhabditis elegans hypothetical protein ZK637.1.
- Escherichia coli hypothetical proteins yabE, ydjE and yhjE.
- Haemophilus influenzae hypothetical proteins HI0281 and HI0418.
- Bacillus subtilis hypothetical proteins yxbC and yxdP.
It has been suggested [4] that these transport proteins have evolved from the duplication of
an ancestral protein with six transmembrane regions; this hypothesis is based on the
conservation of two G-R-[KR] motifs. The first one is located between the second and third
transmembrane domains and the second one between transmembrane domains 8 and 9. Two
patterns have been developed to detect this family of proteins. The first pattern is based on
the G-R-[KR] motif; but because this motif is too short to be specific to this family of
proteins, a pattern from a larger region centered on the second copy of this motif was
derived. The second pattern is based on a number of conserved residues which are located at
the end of the fourth transmembrane segment and in the short loop region between the fourth
and fifth segments.

Consensus pattern: [LIVMSTAG]-[LIVMFSAG]-x(2)-[LIVMSA]-[DE]-x-[LIVMFYWA]-
G-R-[RK]-x(4,6)-[GSTA]

Consensus pattern: [LIVMF]-x-G-[LIVMFA]-x(2)-G-x(8)-[LIFY]-x(2)-[EQ]-x(6)-[RK]
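
The remark that the G-R-[KR] motif is too short to be specific can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that all 20 amino acids occur with equal frequency and independently at each position; real proteins do not satisfy this, so the numbers are only order-of-magnitude estimates. The function name and the 500-residue protein length are hypothetical.

```python
def chance_probability(allowed_per_position, alphabet_size=20):
    """Probability that one sequence window matches by chance, assuming
    uniform and independent residue frequencies (a crude approximation)."""
    p = 1.0
    for n_allowed in allowed_per_position:
        p *= n_allowed / alphabet_size
    return p


# G-R-[KR]: one, one and two allowed residues at the three positions.
p_short = chance_probability([1, 1, 2])

# Constrained positions of the longer first pattern (x positions accept any
# residue and contribute a factor of 1):
# [LIVMSTAG], [LIVMFSAG], [LIVMSA], [DE], [LIVMFYWA], G, R, [RK], [GSTA].
p_long = chance_probability([8, 8, 6, 2, 8, 1, 1, 2, 4])

protein_length = 500  # hypothetical length, chosen only for illustration
expected_short = p_short * (protein_length - 3 + 1)

print(f"G-R-[KR] per-window probability: {p_short:.2e}")
print(f"expected chance hits in {protein_length} residues: {expected_short:.3f}")
print(f"longer pattern per-window probability (one gap length): {p_long:.2e}")
```

Under these assumptions the three-residue motif is expected to occur by chance in a sizeable fraction of unrelated proteins, whereas the extended pattern is several orders of magnitude more selective.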

[ 1] Silverman M. Annu. Rev. Biochem. 60:757-794(1991). [ 2] Gould G.W., Bell G.I. Trends
Biochem. Sci. 15:18-23(1990). [ 3] Baldwin S.A. Biochim. Biophys. Acta 1154:17-49(1993).
[ 4] Maiden M.C.J., Davis E.O., Baldwin S.A., Moore D.C.M., Henderson P.J.F. Nature
325:641-643(1987). [ 5] Henderson P.J.F. Curr. Opin. Struct. Biol. 1:590-601(1991). [ 6]
Culham D.E., Lasby B., Marangoni A.G., Milner J.L., Steer B.A., van Nues R.W., Wood
J.M. J. Mol. Biol. 229:268-276(1993).


633. Synaptobrevin signature 

Synaptobrevin [1] is an intrinsic membrane protein of small synaptic vesicles whose function
is not yet known, but which is highly conserved in mammals, electric ray (where it is known
as VAMP-1), Drosophila and yeast [2]. In yeast there are two closely related forms of
synaptobrevin (genes SNC1 and SNC2), while in mammals there are at least four (genes SYB1,
SYB2, SYB3 and SYBL1). Structurally, synaptobrevin consists of an N-terminal cytoplasmic
domain of 90 to 110 residues, followed by a transmembrane region, and then by a short
(2 to 22 residues) C-terminal intravesicular domain. As a signature pattern for
synaptobrevin, a highly conserved stretch of residues located in the central part of the
sequence was selected.

Consensus pattern: N-[LIVM]-[DENS]-[KL]-V-x-[DEQ]-R-x(2)-[KR]-[LIVM]-[STDE]-x-
[LIVM]-x-[DE]-[KR]-[TA]-[DE]

[ 1] Suedhof T.C., Baumert M., Perin M.S., Jahn R. Neuron 2:1475-1481(1989). [ 2] Gerst
J.E., Rodgers L., Riggs M., Wigler M. Proc. Natl. Acad. Sci. U.S.A. 89:4338-4342(1992).

634. TBC domain. Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST,
which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are
GTPase activator proteins of Rab-like small GTPases. Number of members: 55

[1] Medline: 96032578. Molecular cloning of a cDNA with a novel domain present in the
tre-2 oncogene and the yeast cell cycle regulators BUB2 and cdc16. Richardson PM, Zon LI;
Oncogene 1995;11:1139-1148.

[2] Medline: 97398935. A shared domain between a spindle assembly checkpoint protein and
Ypt/Rab-specific GTPase-activators. Neuwald AF; Trends Biochem Sci 1997;22:243-244.

635. Transcription factor TFIID repeat signature (TBP) 

Transcription factor TFIID (or TATA-binding protein, TBP) [1,2] is a general factor that
plays a major role in the activation of eukaryotic genes transcribed by RNA polymerase II.
TFIID binds specifically to the TATA box promoter element which lies close to the position
of transcription initiation. There is a remarkable degree of sequence conservation of a
C-terminal domain of about 180 residues in TFIID from various eukaryotic sources. This region
is necessary and sufficient for TATA box binding. The most significant structural feature of
this domain is the presence of two conserved repeats of a 77 amino-acid region. The
intramolecular symmetry generates a saddle-shaped structure that sits astride the DNA [3].
Drosophila TRF (TBP-related factor) [4] is a sequence-specific transcription factor that also
binds to the TATA box and is highly similar to TFIID. Archaebacteria also possess a TBP
homolog [5]. A signature pattern that spans the last 50 residues of the repeated region has
been derived.

Consensus pattern: Y-x-P-x(2)-[IF]-x(2)-[LIVM](2)-x-[KRH]-x(3)-P-[RKQ]-x(3)-L-
[LIVM]-F-x-[STN]-G-[KR]-[LIVM]-x(3)-G-[TAGL]-[KR]-x(7)-[AGC]-x(7)-[LIVM

[ 1] Hoffmann A., Sinn E., Yamamoto T., Wang J., Roy A., Horikoshi M., Roeder R.G.
Nature 346:387-390(1990). [ 2] Gasch A., Hoffmann A., Horikoshi M., Roeder R.G., Chua
N.-H. Nature 346:390-394(1990). [ 3] Nikolov D.B., Hu S.-H., Lin J., Gasch A., Hoffmann A.,
Horikoshi M., Chua N.-H., Roeder R.G., Burley S.K. Nature 360:40-46(1992). [ 4] Crowley
T.E., Hoey T., Liu J.-K., Jan Y.N., Jan L.Y., Tjian R. Nature 361:557-561(1993). [ 5] Marsh
T.L., Reich C.I., Whitelock R.B., Olsen G.J. Proc. Natl. Acad. Sci. U.S.A.
91:4180-4184(1994).

636. Translationally controlled tumor protein signatures (TCTP)

Mammalian translationally controlled tumor protein (TCTP) (or P23) is a protein which has
been found to be preferentially synthesized in cells during the early growth phase of some
types of tumor [1,2], but which is also expressed in normal cells. The physiological function
of TCTP is still not known. It is a hydrophilic protein of 18 to 20 Kd. Close homologs have
been found in plants [3], earthworm [4], Caenorhabditis elegans (F52H2.11), Hydra, budding
yeast (YKL056c) [5] and fission yeast (SpAC1F12.02c). Two of the best conserved regions
have been selected as signature patterns for TCTP.

Consensus pattern: [IFA]-[GA]-[GAS]-N-[PAK]-S-[GA]-E-[GDE]-[PAGE]-[DEQGA] 
Consensus pattern: [FLVH]-[FY]-[IVCT]-G-E-x-[MA]-x(2,5)-[DEN]-[GAST]-x-[LV]- 
[AV]-x(3)-[FYW] 

[ 1] Boehm H., Benndorf R., Gaestel M., Gross B., Nuernberg P., Kraft R., Otto A., Bielka H.
Biochem. Int. 19:277-286(1989). [ 2] Makrides S., Chitpatima S.T., Bandyopadhyay R.,
Brawerman G. Nucleic Acids Res. 16:2350-2350(1988). [ 3] Pay A., Heberle-Bors E., Hirt H.
Plant Mol. Biol. 19:501-503(1992). [ 4] Stuerzenbaum S.R., Kille P., Morgan A.J. Biochim.
Biophys. Acta 1398:294-304(1998). [ 5] Rasmussen S.W. Yeast 10:S63-S68(1994).


637. TFIIS zinc ribbon domain signature 

Transcription factor S-II (TFIIS) [1] is a eukaryotic protein necessary for efficient RNA
polymerase II transcription elongation past template-encoded pause sites. TFIIS shows
DNA-binding activity only in the presence of RNA polymerase II. It is a protein of about 300
amino acids whose sequence is highly conserved in mammals, Drosophila, yeast (where it
was first known as PPR2, a transcriptional regulator of URA4, and then as DST1, the DNA
strand transfer protein alpha [2]) and in the archaebacterium Sulfolobus acidocaldarius [3].
This family also includes the eukaryotic and archaebacterial RNA polymerase subunits of the
15 Kd / M family (see <PDOC00790>) as well as the following viral proteins:
- Vaccinia virus RNA polymerase 30 Kd subunit (rpo30) [4].
- African swine fever virus protein I243L [5].
The best conserved region of all these proteins contains four cysteines that bind a zinc ion
and fold in a conformation termed a 'zinc ribbon' [6]. Besides these cysteines, there are a
number of other conserved residues which can be used to help define a specific pattern for
this type of domain.

Consensus pattern: C-x(2)-C-x(9)-[LIVMQSAR]-[QH]-[STQL]-[RA]-[SACR]-x-[DE]- 
[DET]-[PGSEA]-x(6)-C-x(2,5)-C-x(3)-[FW] [The four Cs are zinc ligands] 
[ 1] Hirashima S., Hirai H., Nakanishi Y., Natori S. J. Biol. Chem. 263:3858-3863(1988). [ 2]
Kipling D., Kearsey S.E. Nature 353:509-509(1991). [ 3] Langer D., Zillig W. Nucleic Acids
Res. 21:2251-2251(1993). [ 4] Ahn B.-Y., Gershon P.D., Jones E.V., Moss B. Mol. Cell. Biol.
10:5433-5441(1990). [ 5] Rodriguez J.M., Salas M.L., Vinuela E. Virology 186:40-52(1992).
[ 6] Qian X., Jeon C., Yoon H., Agarwal K., Weiss M.A. Nature 365:277-279(1993).


638. Tetrahydrofolate dehydrogenase/cyclohydrolase signatures (THF DHG CYH)

Enzymes that participate in the transfer of one-carbon units are involved in various
biosynthetic pathways. In many of these processes the transfers of one-carbon units are
mediated by the coenzyme tetrahydrofolate (THF). Various reactions generate one-carbon
derivatives of THF which can be interconverted between different oxidation states by
formyltetrahydrofolate synthetase (EC 6.3.4.3), methylenetetrahydrofolate dehydrogenase
(EC 1.5.1.5 or EC 1.5.1.15) and methenyltetrahydrofolate cyclohydrolase (EC 3.5.4.9). The
dehydrogenase and cyclohydrolase activities are expressed by a variety of multifunctional
enzymes:
- Eukaryotic C-1-tetrahydrofolate synthase (C1-THF synthase), which catalyzes all three
  reactions described above. Two forms of C1-THF synthases are known [1]: one is located in
  the mitochondrial matrix, while the second one is cytoplasmic. In both forms the
  dehydrogenase/cyclohydrolase domain is located in the N-terminal section of the
  900-amino-acid protein and consists of about 300 amino acid residues. The C1-THF synthases
  are NADP-dependent.
- Eukaryotic mitochondrial bifunctional dehydrogenase/cyclohydrolase [2]. This is a
  homodimeric NAD-dependent enzyme of about 300 amino acid residues.
- Bacterial folD [3]. FolD is a homodimeric bifunctional NADP-dependent enzyme of about
  290 amino acid residues.
The sequence of the dehydrogenase/cyclohydrolase domain is highly conserved in all forms of
the enzyme. Two conserved regions have been selected as signature patterns. The first one is
located in the N-terminal part of these enzymes and contains three acidic residues. The
second pattern is a highly conserved sequence of 9 amino acids which is located in the
C-terminal section.

Consensus pattern: [EQ]-x-[EQK]-[LIVM](2)-x(2)-[LIVM]-x(2)-[LIVMY]-N-x-[DN]-x(5)-
[LIVMF](3)-Q-L-P-[LV]

Consensus pattern: P-G-G-V-G-P-[MF]-T-[IV]
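
The statement that the second pattern covers 9 amino acids can be checked directly from the pattern notation by summing the repeat counts of its elements. The following is a minimal sketch of such a check; the function name pattern_span is hypothetical, and the parser handles only the notation features used in this specification (single elements with an optional (n) or (n,m) repeat).

```python
import re

# Optional trailing repeat such as (3) or (2,5) on a pattern element.
_REPEAT = re.compile(r"\((\d+)(?:,(\d+))?\)$")


def pattern_span(pattern: str) -> tuple[int, int]:
    """Return the (minimum, maximum) number of residues a pattern can cover."""
    lo = hi = 0
    for element in pattern.replace(" ", "").split("-"):
        m = _REPEAT.search(element)
        if m:
            n_min = int(m.group(1))
            n_max = int(m.group(2) or m.group(1))
        else:
            n_min = n_max = 1  # a bare element covers exactly one residue
        lo += n_min
        hi += n_max
    return lo, hi


print(pattern_span("P-G-G-V-G-P-[MF]-T-[IV]"))
print(pattern_span("[EQ]-x-[EQK]-[LIVM](2)-x(2)-[LIVM]-x(2)-[LIVMY]-N-x-[DN]-x(5)-"
                   "[LIVMF](3)-Q-L-P-[LV]"))
```

For patterns containing a range such as x(2,5), the two returned values differ, which gives the shortest and longest stretch the pattern can match.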

[ 1] Shannon K.W., Rabinowitz J.C. J. Biol. Chem. 263:7717-7725(1988). [ 2] Belanger C.,
Mackenzie R.E. J. Biol. Chem. 264:4837-4843(1989). [ 3] d'Ari L., Rabinowitz J.C. J. Biol.
Chem. 266:23953-23958(1991).

639. Triosephosphate isomerase active site (TIM) 

Triosephosphate isomerase (EC 5.3.1.1) (TIM) [1] is the glycolytic enzyme that catalyzes the
reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate.
TIM plays an important role in several metabolic pathways and is essential for efficient
energy production. It is a dimer of identical subunits, each of which is made up of about 250
amino-acid residues. A glutamic acid residue is involved in the catalytic mechanism [2]. The
sequence around the active site residue is perfectly conserved in all known TIM's and can be
used as a signature pattern for this type of enzyme.

Consensus pattern: [AV]-Y-E-P-[LIVM]-W-[SA]-I-G-T-[GK] [E is the active site residue] 
[ 1] Lolis E., Alber T., Davenport R.C., Rose D., Hartman F.C., Petsko G.A. Biochemistry 
29:6609-6618(1990).[ 2] Knowles J.R. Nature 350:121-124(1991). 


640. Thymidine kinase cellular-type signature (TK) 

Thymidine kinase (TK) (EC 2.7.1.21) is a ubiquitous enzyme that catalyzes the
ATP-dependent phosphorylation of thymidine. A comparison of TK sequences has shown [1,2,3]
that there are two different families of TK. One family groups together TK from herpes
viruses as well as cellular thymidylate kinases, while the second family currently consists of
TK from the following sources:
- Vertebrates.
- Bacteria.
- Bacteriophage T4.
- Pox viruses.
- African swine fever virus (ASF).
- Fish lymphocystis disease virus (FLDV).
A conserved region which is located in the C-terminal section of these enzymes has been
selected as a signature pattern for this family of TK.

Consensus pattern: [GA]-x(1,2)-[DE]-x-Y-x-[STAP]-x-C-[NKR]-x-[CH]-[LIVMFYWH]

[ 1] Boyle D.B., Coupar B.E.H., Gibbs A.J., Seigman L.J., Both G.W. Virology
156:355-365(1987). [ 2] Blasco R., Lopez-Otin C., Munoz M., Bockamp E.-O., Simon-Mateo C.,
Vinuela E. Virology 178:301-304(1990). [ 3] Robertson G.R., Whalley J.M. Nucleic Acids
Res. 16:11303-11317(1988).



641. Thymidine kinase from herpesvirus (TK herpes)

[1] Medline: 96003730. Crystal structures of the thymidine kinase from herpes simplex virus
type-1 in complex with deoxythymidine and ganciclovir. Brown DG, Visse R, Sandhu G,
Davies A, Rizkallah PJ, Melitz C, Summers WC, Sanderson MR; Nat Struct Biol
1995;2:876-881.

Number of members: 65


642. Nuclear transition protein 2 signatures (TP2) 

In mammals, the second stage of spermatogenesis is characterized by the conversion of
nucleosomal chromatin to the compact, non-nucleosomal and transcriptionally inactive form
found in the sperm nucleus. This condensation is associated with a double-protein transition.
The first transition corresponds to the replacement of histones by several spermatid-specific
proteins, also called transition proteins, which are themselves replaced by protamines during
the second transition. Nuclear transition protein 2 (TP2) is one of those spermatid-specific
proteins. TP2 is a basic, zinc-binding protein [1] of 116 to 137 amino-acid residues.
Structurally, TP2 consists of three distinct parts: a conserved serine-rich N-terminal domain
of about 25 residues, a variable central domain of 20 to 50 residues which contains cysteine
residues, and a conserved C-terminal domain of about 70 residues rich in lysines and
arginines. Two signature patterns for TP2 have been developed: one located in the N-terminal
domain, the other in the C-terminal domain.

Consensus pattern: H-x(3)-H-S-[NS]-S-x-P-Q-S

Consensus pattern: K-x-R-K-x(2)-E-G-K-x(2)-K-[KR]-K 

[ 1] Baskaran R., Rao M.R.S. Biochem. Biophys. Res. Commun. 179:1491-1499(1991). 
643. Thiamine pyrophosphate enzymes signature (TPP enzymes)

A number of enzymes require thiamine pyrophosphate (TPP) (vitamin B1) as a cofactor. It
has been shown [1] that some of these enzymes are structurally related. These related TPP
enzymes are:
- Pyruvate oxidase (POX) (EC 1.2.3.3). Reaction catalyzed: pyruvate + orthophosphate + O(2)
  + H(2)O = acetyl phosphate + CO(2) + H(2)O(2).
- Pyruvate decarboxylase (PDC) (EC 4.1.1.1). Reaction catalyzed: pyruvate = acetaldehyde +
  CO(2).
- Indolepyruvate decarboxylase (EC 4.1.1.74) [2]. Reaction catalyzed: indole-3-pyruvate =
  indole-3-acetaldehyde + CO(2).
- Acetolactate synthase (ALS) (EC 4.1.3.18). Reaction catalyzed: 2 pyruvate = acetolactate +
  CO(2).
- Benzoylformate decarboxylase (BFD) (EC 4.1.1.7) [3]. Reaction catalyzed: benzoylformate =
  benzaldehyde + CO(2).
A conserved region which is located in their C-terminal section has been selected as a
signature pattern for these enzymes.

Consensus pattern: [LIVMF]-[GSA]-x(5)-P-x(4)-[LIVMFYW]-x-[LIVMF]-x-G-D-[GSA]-
[GSAC]

[ 1] Green J.B.A. FEBS Lett. 246:1-5(1989).[ 2] Koga J., Adachi T., Hidaka H. Mol. Gen. 
Genet. 226:10-16(1991).[ 3] Tsou A.Y., Ransom S.C., Gerlt J.A., Buechter D.D., Babbitt 
P.C., Kenyon G.L. Biochemistry 29:9856-9862(1990). 


644. TPR Domain 

[1] Medline: 95397415. Tetratrico peptide repeat interactions: to TPR or not to TPR? Lamb JR,
Tugendreich S, Hieter P; Trends Biochem Sci 1995;20:257-259.

[2] Medline: 98151343. The structure of the tetratricopeptide repeats of protein phosphatase 5:
implications for TPR-mediated protein-protein interactions. Das AK, Cohen PW, Barford D;
EMBO J 1998;17:1192-1199.

Number of members: 621

645. Uroporphyrin-III C-methyltransferase signatures (TP methylase)

Uroporphyrin-III C-methyltransferase (EC 2.1.1.107) (SUMT) [1,2] catalyzes the transfer of
two methyl groups from S-adenosyl-L-methionine to the C-2 and C-7 atoms of
uroporphyrinogen III to yield precorrin-2 via the intermediate formation of precorrin-1.
SUMT is the first enzyme specific to the cobalamin pathway, and precorrin-2 is a common
intermediate in the biosynthesis of corrinoids such as vitamin B12, siroheme and coenzyme
F430. The sequences of SUMT from a variety of eubacterial and archaebacterial species are
currently available. In species such as Bacillus megaterium (gene cobA), Pseudomonas
denitrificans (cobA) or Methanobacterium ivanovii (gene corA), SUMT is a protein of about
25 to 30 Kd. In Escherichia coli and related bacteria, the cysG protein, which is involved in
the biosynthesis of siroheme, is a multifunctional protein composed of an N-terminal domain,
probably involved in transforming precorrin-2 into siroheme, and a C-terminal domain which
has SUMT activity. The sequence of SUMT is related to that of a number of P. denitrificans
and Salmonella typhimurium enzymes involved in the biosynthesis of cobalamin which also
seem to be SAM-dependent methyltransferases [3,4]. The similarity is especially strong with
two of these enzymes: cobI/cbiL, which encodes S-adenosyl-L-methionine--precorrin-2
methyltransferase, and cobM/cbiF, whose exact function is not known. Two signature patterns
have been developed for these enzymes. The first corresponds to a well conserved region in
the N-terminal extremity (called region 1 in [1,3]) and the second to a less conserved region
located in the central part of these proteins (this pattern spans what are called regions 2 and 3
in [1,3]).

Consensus pattern: [LIVM]-[GS]-[STAL]-G-P-G-x(3)-[LIVMFY]-[LIVM]-T-[LIVM]-
[KRHQG]-[AG]

Consensus pattern: V-x(2)-[LI]-x(2)-G-D-x(3)-[FYW]-[GS]-x(8)-[LIVF]-x(5,6)- 
[LIVMFYWPAC]-x-[LIVMY]-x-P-G 
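
The consensus patterns in this listing follow PROSITE-style notation: residues are separated by hyphens, square brackets list the residues accepted at a position, curly braces list residues that are excluded, x stands for any residue, and a number or range in parentheses gives a repeat count. As a minimal sketch of how such a pattern can be applied to a sequence, the code below translates the notation into a Python regular expression. The function name prosite_to_regex and the test peptide are assumptions made for this illustration; they do not appear in the original description, and terminal anchors are not handled.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a regex string."""
    elements = [e for e in pattern.replace(" ", "").split("-") if e]
    parts = []
    for element in elements:
        # Split off an optional repeat suffix such as "(3)" or "(5,6)".
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core in ("x", "X"):
            token = "."                          # any residue
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"      # excluded residues
        else:
            token = core                         # single residue or [..] class
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)

# The two signature patterns given above for this family.
SUMT_PATTERN_1 = "[LIVM]-[GS]-[STAL]-G-P-G-x(3)-[LIVMFY]-[LIVM]-T-[LIVM]-[KRHQG]-[AG]"
SUMT_PATTERN_2 = ("V-x(2)-[LI]-x(2)-G-D-x(3)-[FYW]-[GS]-x(8)-[LIVF]-x(5,6)-"
                  "[LIVMFYWPAC]-x-[LIVMY]-x-P-G")

# Hypothetical peptide built to satisfy the first pattern; not a real SUMT sequence.
HYPOTHETICAL_PEPTIDE = "MKLGSGPGAAALITLKAQ"

hit = re.search(prosite_to_regex(SUMT_PATTERN_1), HYPOTHETICAL_PEPTIDE)
print(hit.group(0) if hit else "no match")
```

A regular expression reports only exact pattern matches, so a hit suggests, but does not establish, family membership.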

[1] Blanche F., Robin C., Couder M., Faucher D., Cauchois L., Cameron B., Crouzet J. J.
Bacteriol. 173:4637-4645(1991). [2] Robin C., Blanche F., Cauchois L., Cameron B., Couder
M., Crouzet J. J. Bacteriol. 173:4893-4896(1991). [3] Crouzet J., Cameron B., Cauchois L.,
Rigault S., Rouyez M.-C., Blanche F., Thibaut D., Debussche L. J. Bacteriol. 172:5980-5990(1990).
[4] Roth J.R., Lawrence J.G., Rubenfield M., Kieffer-Higgins S., Church G.M. J.
Bacteriol. 175:3303-3316(1993). [5] Mattheakis L.C., Shen W.H., Collier R.J. Mol. Cell.
Biol. 12:4026-4037(1992).



646. Tudor domain

Domain of unknown function present in several RNA-binding proteins, with multiple copies in the
Drosophila Tudor protein. Slight ambiguities in the alignment. Number of members: 18
[1] Medline: 97200561 Tudor domains in proteins that interact with RNA. Ponting CP;
Trends Biochem Sci 1997;22:51-52. [2] Medline: 97157029 The human EBNA-2
coactivator p100: multidomain organization and relationship to the staphylococcal nuclease
fold and to the tudor protein involved in Drosophila melanogaster development. Callebaut I,
Mornon JP; Biochem J 1997;321:125-132.



647. Terpene synthase family

It has been suggested that this gene family be designated
tps (for terpene synthase) [1]. It has been split into six
subgroups on the basis of phylogeny, called tpsa-tpsf.
tpsa includes vetispiradiene synthase, Swiss:Q39979, 5-epi-aristolochene
synthase, Swiss:Q40577, and (+)-delta-cadinene
synthase, Swiss:P93665.
tpsb includes (-)-limonene synthase, Swiss:Q40322.
tpsc includes kaurene synthase A, Swiss:O04408.
tpsd includes taxadiene synthase, Swiss:Q41594, pinene synthase,
Swiss:O24475, and myrcene synthase, Swiss:O24474.
tpse includes kaurene synthase B.
tpsf includes linalool synthase.
Number of members: 51
[1]
Medline: 97413772
Monoterpene synthases from grand fir (Abies grandis). cDNA
isolation, characterization, and functional expression of
myrcene synthase, (-)-(4S)-limonene synthase, and
(-)-(1S,5S)-pinene synthase.
Bohlmann J, Steele CL, Croteau R;
J Biol Chem 1997;272:21784-21792.

648. ThiF family 

This family contains a repeated domain in ubiquitin
activating enzyme E1 and members of the bacterial
ThiF/MoeB/HesA family.
Number of members: 87

649. Thioester dehydrase 

Members of this family are involved in fatty acid biosynthesis. 
Number of members: 19 
[1]
Medline: 96398612
Structure of a dehydratase-isomerase from the bacterial
pathway for biosynthesis of unsaturated fatty acids: two
catalytic activities in one active site.
Leesong M, Henderson BS, Gillig JR, Schwab JM, Smith JL;
Structure 1996;4:253-264.
Database reference: SCOP; 1mka; fa; [SCOP-USA] [CATH-PDBSUM]
Database reference: PFAMB; PB058036;

650. Tub family signatures 

The mouse tubby mutation is the cause of maturity-onset obesity, insulin resistance and
sensory deficits. This mutation maps to a gene, tub [1,2], which codes for a protein that
belongs to a family which currently consists of the following members: - Mammalian tub, a
hydrophilic protein of about 500 residues, which could be involved in the hypothalamic
regulation of body weight. - Human protein TULP1 [3], which may be involved in retinitis
pigmentosa 14, a retinal degeneration disease. - Mouse protein p4-6, whose function is not
known. - Caenorhabditis elegans hypothetical protein F10B5.4. - Several fragmentary
sequences from plants, Drosophila and human ESTs. While the N-terminal part of these
proteins is not conserved in length or in sequence, the C-terminal 250 residues are highly
conserved. Therefore, two regions were selected in the C-terminal part as signature patterns.
The second region is located at the C-terminal extremity and contains a penultimate cysteine
residue that could be critical to the normal functioning of these proteins.
Consensus pattern: F-[KHQ]-G-R-V-[ST]-x-A-S-V-K-N-F-Q
Consensus pattern: A-F-[AG]-I-[SAC]-[LIVM]-[ST]-S-F-x-[GST]-K-x-A-C-E
[1] Kleyn P.W., Fan W., Kovats S.G., Lee J.L., Pulido J.C., Wu Y., Berkemeier L.R.,
Misumi D.J., Holmgren L., Charlat O., Woolf E.A., Tayber O., Brody T., Shu P., Hawkins F.,
Kennedy B., Baldini L., Ebeling C., Alperin G.D., Deeds J., Lakey N.D., Culpepper J., Chen
H., Gluecksmann-Kuis M.A., Carlson G.A., Duyk G.M., Moore K.J. Cell 85:281-290(1996).
[2] Noben-Trauth K., Naggert J.K., North M.A., Nishina P.M. Nature 380:534-538(1996). [3]
North M.A., Naggert J.K., Yan Y., Noben-Trauth K., Nishina P.M. Proc. Natl. Acad. Sci.
U.S.A. 94:3128-3133(1997).

651. Eukaryotic DNA topoisomerase I active site 

DNA topoisomerase I (EC 5.99.1.2) [1,2,3,4,E1] is one of the two types of enzyme that
catalyze the interconversion of topological DNA isomers. Type I topoisomerases act by
catalyzing the transient breakage of DNA, one strand at a time, and the subsequent rejoining
of the strands. When a eukaryotic type I topoisomerase breaks a DNA backbone bond, it
simultaneously forms a protein-DNA link where the hydroxyl group of a tyrosine residue is
joined to a 3'-phosphate on DNA, at one end of the enzyme-severed DNA strand. In
eukaryotes and pox virus topoisomerases I, there are a number of conserved residues in the
region around the active site tyrosine.

Consensus pattern: [DEN]-x(6)-[GS]-[IT]-S-K-x(2)-Y-[LIVM]-x(3)-[LIVM] [Y is the active 
site tyrosine] 
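
Because the pattern above marks one specific residue (the active site tyrosine), a match can be used to report that residue's position and not only the presence of the signature. The sketch below is one way to do this with a capture group; the regular expression is a hand translation of the consensus pattern, and the function name and toy sequence are hypothetical, introduced only for this example.

```python
import re

# Regex form of the consensus pattern above; the parentheses capture the
# tyrosine that the text marks as the active site residue.
TOPO_I_ACTIVE_SITE = re.compile(r"[DEN].{6}[GS][IT]SK.{2}(Y)[LIVM].{3}[LIVM]")

def active_site_positions(sequence: str) -> list:
    """Return 1-based positions of the tyrosine in each non-overlapping match."""
    return [m.start(1) + 1 for m in TOPO_I_ACTIVE_SITE.finditer(sequence)]

# Hypothetical peptide built to satisfy the pattern; not a real topoisomerase sequence.
toy_sequence = "AA" + "D" + "A" * 6 + "GISK" + "AA" + "Y" + "L" + "AAA" + "L" + "AA"
print(active_site_positions(toy_sequence))
```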

[1] Sternglanz R. Curr. Opin. Cell Biol. 1:533-535(1990). [2] Sharma A., Mondragon A.
Curr. Opin. Struct. Biol. 5:39-47(1995). [3] Lynn R.M., Bjornsti M.-A., Caron P.R., Wang
J.C. Proc. Natl. Acad. Sci. U.S.A. 86:3559-3563(1989). [4] Roca J. Trends Biochem. Sci.
20:156-160(1995). [E1]
652. Transaldolase signatures

Transaldolase (EC 2.2.1.2) catalyzes the reversible transfer of a three-carbon ketol unit from
sedoheptulose 7-phosphate to glyceraldehyde 3-phosphate to form erythrose 4-phosphate and
fructose 6-phosphate. This enzyme, together with transketolase, provides a link between the
glycolytic and pentose-phosphate pathways. Transaldolase is an enzyme of about 34 Kd
whose sequence has been well conserved throughout evolution. A lysine has been implicated
[1] in the catalytic mechanism of the enzyme; it acts as a nucleophilic group that attacks the
carbonyl group of fructose-6-phosphate. Transaldolase is evolutionarily related [2] to a
bacterial protein of about 20 Kd (known as talC in Escherichia coli), whose exact function is
not yet known. Two signature patterns have been developed for these proteins. The first,
located in the N-terminal section, contains a perfectly conserved pentapeptide; the second
includes the active site lysine.

Consensus pattern: [DG]-[IVSA]-T-[ST]-N-P-[STA]-[LIVMF](2) 

Consensus pattern: [LIVM]-x-[LIVM]-K-[LIVM]-[PAS]-x-[ST]-x-[DENQPAS]-G-[LIVM]-
x-[AGV]-x-[QEKRST]-x-[LIVM] [K is the active site residue]

[1] Miosga T., Schaaff-Gerstenschlaeger I., Franken E., Zimmermann F.K. Yeast 9:1241-1249(1993).
[2] Reizer J., Reizer A., Saier M.H. Jr. Microbiology 141:961-971(1995).

653. (Transpeptidase) Penicillin binding protein transpeptidase domain

The active site serine (residue 337 in Swiss:P14677) is conserved in all members of this
family. 

[1] Pares S, Mouz N, Petillot Y, Hakenbeck R, Dideberg O; Nat Struct Biol 1996;3:284-289.

654. Trehalase signatures 

Trehalase (EC 3.2.1.28) is the enzyme responsible for the degradation of the disaccharide
alpha,alpha-trehalose, yielding two glucose subunits [1]. It is an enzyme found in a wide
variety of organisms and whose sequence has been highly conserved throughout evolution.
Two of the most highly conserved regions have been selected as signature patterns. The first
pattern is located in the central section, the second one is in the C-terminal region.
Consensus pattern: P-G-G-R-F-x-E-x-Y-x-W-D-x-Y
Consensus pattern: Q-W-D-x-P-x-[GA]-W-[PAS]-P

[1] Kopp M., Mueller H., Holzer H. J. Biol. Chem. 268:4766-4774(1993). [2] Henrissat B.,
Bairoch A. Biochem. J. 293:781-788(1993). [E1]

655. Trehalose-6-phosphate synthase domain 

OtsA (Trehalose-6-phosphate synthase) is homologous to regions
in the subunits of the yeast trehalose-6-phosphate synthase/phosphatase complex [1].
[1] Kaasen I, McDougall J, Strom AR; Gene 1994;145:9-15.

656. Tropomyosins signature 

Tropomyosins [1,2] are a family of closely related proteins present in muscle and non-muscle
cells. In striated muscle, tropomyosins mediate the interactions between the troponin complex
and actin so as to regulate muscle contraction. The role of tropomyosin in smooth muscle and
non-muscle tissues is not clear. Tropomyosin is an alpha-helical protein that forms a coiled-coil
dimer. Muscle isoforms of tropomyosin are characterized by having 284 amino acid
residues and a highly conserved N-terminal region, whereas non-muscle forms are generally
smaller and are heterogeneous in their N-terminal region. The signature pattern for
tropomyosins is based on a very conserved region in the C-terminal section of tropomyosins
which is present in both muscle and non-muscle forms.
Consensus pattern: L-K-E-A-E-x-R-A-E

[1] Smilie L.B. Trends Biochem. Sci. 4:151-155(1979). [2] McLeod A.R. BioEssays 6:208-212(1986).

657. Troponin 

Troponin (Tn) contains three subunits, Ca2+ binding (TnC),
inhibitory (TnI), and tropomyosin binding (TnT). This Pfam family contains
members of the TnT subunit.
Troponin is a complex of three proteins, Ca2+ binding (TnC),
inhibitory (TnI), and tropomyosin binding (TnT).
The troponin complex regulates Ca2+ induced muscle contraction.
This family includes troponin T and troponin I. Troponin I
binds to actin and troponin T binds to tropomyosin.
Number of members: 81
[1] Medline: 87144593
Structure of co-crystals of tropomyosin and troponin.
White SP, Cohen C, Phillips GN Jr;
Nature 1987;325:826-828.
[2] Medline: 95155315
A direct regulatory role for troponin T and a dual role for
troponin C in the Ca2+ regulation of muscle contraction.
Potter JD, Sheng Z, Pan BS, Zhao J;
J Biol Chem 1995;270:2557-2562.
[3] Medline: 95324796
The troponin complex and regulation of muscle contraction.
Farah CS, Reinach FC;
FASEB J 1995;9:755-767.

658. (Tryp mucin) Mucin-like glycoprotein

This family of trypanosomal proteins resembles vertebrate mucins. The protein consists of
three regions. The N and C termini are conserved between all members of the family,
whereas the central region is not well conserved and contains a large number of threonine
residues which can be glycosylated [1].

Indirect evidence suggested that these genes might encode the core protein of parasite
mucins, glycoproteins that were proposed to be involved in the interaction with, and invasion
of, mammalian host cells.

[1] Di Noia JM, Sanchez DO, Frasch AC; J Biol Chem 1995;270:24146-24149.

[2] Di Noia JM, D'Orso I, Aslund L, Sanchez DO, Frasch AC; J Biol Chem 1998;273:10843-10850.
659. Aminoacyl-transfer RNA synthetases class-I signature (tRNA synt 1)
Aminoacyl-tRNA synthetases (EC 6.1.1.-) [1] are a group of enzymes which activate amino
acids and transfer them to specific tRNA molecules as the first step in protein biosynthesis. In
prokaryotic organisms there are at least twenty different types of aminoacyl-tRNA
synthetases, one for each different amino acid. In eukaryotes there are generally two
aminoacyl-tRNA synthetases for each different amino acid: one cytosolic form and a
mitochondrial form. While all these enzymes have a common function, they are widely
diverse in terms of subunit size and of quaternary structure. A few years ago it was found [2]
that several aminoacyl-tRNA synthetases share a region of similarity in their N-terminal
section, in particular the consensus tetrapeptide His-Ile-Gly-His ('HIGH') is very well
conserved. The 'HIGH' region has been shown [3] to be part of the adenylate binding site.
The 'HIGH' signature has been found in the aminoacyl-tRNA synthetases specific for
arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine,
tryptophan, and valine. These aminoacyl-tRNA synthetases are referred to as class-I
synthetases [4,5,6] and seem to share the same tertiary structure based on a Rossmann fold.
Consensus pattern: P-x(0,2)-[GSTAN]-[DENQGAPK]-x-[LIVMFP]-[HT]-[LIVMYAC]-G- 
[HNTG]-[LIVMFYSTAGPC] 
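
The consensus pattern is more permissive than the literal 'HIGH' tetrapeptide: the four positions [HT]-[LIVMYAC]-G-[HNTG] admit HIGH but also related tetrapeptides. The sketch below illustrates this by capturing those four positions from each match. The regular expression is a hand translation of the pattern above; the function name and both peptide fragments are hypothetical and were invented for this example.

```python
import re

# Regex form of the class-I consensus pattern; the captured group covers the
# four positions [HT]-[LIVMYAC]-G-[HNTG] that correspond to the 'HIGH' tetrapeptide.
CLASS_I_SIGNATURE = re.compile(
    r"P.{0,2}[GSTAN][DENQGAPK].[LIVMFP]([HT][LIVMYAC]G[HNTG])[LIVMFYSTAGPC]"
)

def high_like_motifs(sequence: str) -> list:
    """Return the tetrapeptide found at the 'HIGH' positions of each match."""
    return [m.group(1) for m in CLASS_I_SIGNATURE.finditer(sequence)]

# Hypothetical fragments, invented for illustration only.
canonical = "PSGYLHIGHA"   # carries the literal HIGH tetrapeptide
variant = "PSGYLTVGNA"     # allowed by the pattern, but not literally HIGH

print(high_like_motifs(canonical), high_like_motifs(variant))
```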

[1] Schimmel P. Annu. Rev. Biochem. 56:125-158(1987). [2] Webster T., Tsai H., Kula M.,
Mackie G.A., Schimmel P. Science 226:1315-1317(1984). [3] Brick P., Bhat T.N., Blow
D.M. J. Mol. Biol. 208:83-98(1988). [4] Delarue M., Moras D. BioEssays 15:675-687(1993).
[5] Schimmel P. Trends Biochem. Sci. 16:1-3(1991). [6] Nagel G.M., Doolittle
R.F. Proc. Natl. Acad. Sci. U.S.A. 88:8121-8125(1991).

660. Aminoacyl-transfer RNA synthetases class-I signature (tRNA synt 1b)
Aminoacyl-tRNA synthetases (EC 6.1.1.-) [1] are a group of enzymes which activate amino
acids and transfer them to specific tRNA molecules as the first step in protein biosynthesis. In
prokaryotic organisms there are at least twenty different types of aminoacyl-tRNA
synthetases, one for each different amino acid. In eukaryotes there are generally two
aminoacyl-tRNA synthetases for each different amino acid: one cytosolic form and a
mitochondrial form. While all these enzymes have a common function, they are widely
diverse in terms of subunit size and of quaternary structure. A few years ago it was found [2]
that several aminoacyl-tRNA synthetases share a region of similarity in their N-terminal
section, in particular the consensus tetrapeptide His-Ile-Gly-His ('HIGH') is very well
conserved. The 'HIGH' region has been shown [3] to be part of the adenylate binding site.
The 'HIGH' signature has been found in the aminoacyl-tRNA synthetases specific
for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine,
tryptophan, and valine. These aminoacyl-tRNA synthetases are referred to as class-I
synthetases [4,5,6] and seem to share the same tertiary structure based on a Rossmann fold.
Consensus pattern: P-x(0,2)-[GSTAN]-[DENQGAPK]-x-[LIVMFP]-[HT]-[LIVMYAC]-G-
[HNTG]-[LIVMFYSTAGPC]
[1] Schimmel P. Annu. Rev. Biochem. 56:125-158(1987). [2] Webster T., Tsai H., Kula M.,
Mackie G.A., Schimmel P. Science 226:1315-1317(1984). [3] Brick P., Bhat T.N., Blow
D.M. J. Mol. Biol. 208:83-98(1988). [4] Delarue M., Moras D. BioEssays 15:675-687(1993).
[5] Schimmel P. Trends Biochem. Sci. 16:1-3(1991). [6] Nagel G.M., Doolittle
R.F. Proc. Natl. Acad. Sci. U.S.A. 88:8121-8125(1991).

661. (tRNA-synt 1c) tRNA synthetases class I (E and Q)

Other tRNA synthetase sub-families are too dissimilar to be included.
This family includes only glutamyl and glutaminyl tRNA synthetases.
In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and
tRNA(Gln).

[1] Rath VL, Silvian LP, Beijer B, Sproat BS, Steitz TA; Structure 1998;6:439-449.

662. (tRNA-synt 1d) tRNA synthetases class I (R)

Other tRNA synthetase sub-families are too dissimilar to be included.
This family includes only arginyl tRNA synthetase.

663. Aminoacyl-transfer RNA synthetases class-II signatures (tRNA synt 2) 
Aminoacyl-tRNA synthetases (EC 6.1.1.-) [1] are a group of enzymes which activate amino
acids and transfer them to specific tRNA molecules as the first step in protein biosynthesis. In
prokaryotic organisms there are at least twenty different types of aminoacyl-tRNA
synthetases, one for each different amino acid. In eukaryotes there are generally two
aminoacyl-tRNA synthetases for each different amino acid: one cytosolic form and a
mitochondrial form. While all these enzymes have a common function, they are widely
diverse in terms of subunit size and of quaternary structure. The synthetases specific for
alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine,
and threonine are referred to as class-II synthetases [2 to 6] and probably have a common
folding pattern in their catalytic domain for the binding of ATP and amino acid which is
different from the Rossmann fold observed for the class I synthetases [7]. Class-II tRNA
synthetases do not share a high degree of similarity; however, at least three conserved regions
are present [2,5,8]. Signature patterns have been derived from two of these regions.
Consensus pattern: [FYH]-R-x-[DE]-x(4,12)-[RH]-x(3)-F-x(3)-[DE]

Consensus pattern: [GSTALVF]-{DENQHRKP}-[GSTA]-[LIVMF]-[DE]-R-[LIVMF]-x-
[LIVMSTAG]-[LIVMFY]
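
The second pattern uses curly braces, which in this notation list residues that must not occur at a position. A minimal sketch of how that exclusion behaves is given below, using a hand-translated regular expression with a negated character class. Both decapeptides are hypothetical and were invented for this example; they are not taken from any synthetase.

```python
import re

# {DENQHRKP} means "any residue except D, E, N, Q, H, R, K or P", which maps
# onto a negated character class in a regular expression.
CLASS_II_SIGNATURE_2 = re.compile(
    r"[GSTALVF][^DENQHRKP][GSTA][LIVMF][DE]R[LIVMF].[LIVMSTAG][LIVMFY]"
)

# Hypothetical decapeptides, invented for illustration only.
allowed = "GAGLERLALL"    # alanine at the second position is permitted
excluded = "GDGLERLALL"   # aspartate at the second position is forbidden

print(bool(CLASS_II_SIGNATURE_2.fullmatch(allowed)),
      bool(CLASS_II_SIGNATURE_2.fullmatch(excluded)))
```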

[1] Schimmel P. Annu. Rev. Biochem. 56:125-158(1987). [2] Delarue M., Moras D.
BioEssays 15:675-687(1993). [3] Schimmel P. Trends Biochem. Sci. 16:1-3(1991). [4] Nagel
G.M., Doolittle R.F. Proc. Natl. Acad. Sci. U.S.A. 88:8121-8125(1991). [5] Cusack S.,
Haertlein M., Leberman R. Nucleic Acids Res. 19:3489-3498(1991). [6] Cusack S.
Biochimie 75:1077-1081(1993). [7] Cusack S., Berthet-Colominas C., Haertlein M., Nassar
N., Leberman R. Nature 347:249-255(1990). [8] Leveque F., Plateau P., Dessen P., Blanquet
S. Nucleic Acids Res. 18:305-312(1990).

664. Aminoacyl-transfer RNA synthetases class-I signature (tRNA synt 1e)
Aminoacyl-tRNA synthetases (EC 6.1.1.-) [1] are a group of enzymes which activate amino
acids and transfer them to specific tRNA molecules as the first step in protein biosynthesis. In
prokaryotic organisms there are at least twenty different types of aminoacyl-tRNA
synthetases, one for each different amino acid. In eukaryotes there are generally two
aminoacyl-tRNA synthetases for each different amino acid: one cytosolic form and a
mitochondrial form. While all these enzymes have a common function, they are widely
diverse in terms of subunit size and of quaternary structure. A few years ago it was found [2]
that several aminoacyl-tRNA synthetases share a region of similarity in their N-terminal
section, in particular the consensus tetrapeptide His-Ile-Gly-His ('HIGH') is very well
conserved. The 'HIGH' region has been shown [3] to be part of the adenylate binding site.
The 'HIGH' signature has been found in the aminoacyl-tRNA synthetases specific
for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine,
tryptophan, and valine. These aminoacyl-tRNA synthetases are referred to as class-I
synthetases [4,5,6] and seem to share the same tertiary structure based on a Rossmann fold.
Consensus pattern: P-x(0,2)-[GSTAN]-[DENQGAPK]-x-[LIVMFP]-[HT]-[LIVMYAC]-G-
[HNTG]-[LIVMFYSTAGPC]
[1] Schimmel P. Annu. Rev. Biochem. 56:125-158(1987). [2] Webster T., Tsai H., Kula M.,
Mackie G.A., Schimmel P. Science 226:1315-1317(1984). [3] Brick P., Bhat T.N., Blow
D.M. J. Mol. Biol. 208:83-98(1988). [4] Delarue M., Moras D. BioEssays 15:675-687(1993).
[5] Schimmel P. Trends Biochem. Sci. 16:1-3(1991). [6] Nagel G.M., Doolittle
R.F. Proc. Natl. Acad. Sci. U.S.A. 88:8121-8125(1991).

665. Aminoacyl-transfer RNA synthetases class-II signatures (tRNA synt 2b)
Aminoacyl-tRNA synthetases (EC 6.1.1.-) [1] are a group of enzymes which activate amino
acids and transfer them to specific tRNA molecules as the first step in protein biosynthesis. In
prokaryotic organisms there are at least twenty different types of aminoacyl-tRNA
synthetases, one for each different amino acid. In eukaryotes there are generally two
aminoacyl-tRNA synthetases for each different amino acid: one cytosolic form and a
mitochondrial form. While all these enzymes have a common function, they are widely
diverse in terms of subunit size and of quaternary structure. The synthetases specific for
alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine,
and threonine are referred to as class-II synthetases [2 to 6] and probably have a common
folding pattern in their catalytic domain for the binding of ATP and amino acid which is
different from the Rossmann fold observed for the class I synthetases [7]. Class-II tRNA
synthetases do not share a high degree of similarity; however, at least three conserved regions
are present [2,5,8]. Signature patterns have been derived from two of these regions.

Consensus pattern: [FYH]-R-x-[DE]-x(4,12)-[RH]-x(3)-F-x(3)-[DE]

Consensus pattern: [GSTALVF]-{DENQHRKP}-[GSTA]-[LIVMF]-[DE]-R-[LIVMF]-x-
[LIVMSTAG]-[LIVMFY]
[1] Schimmel P. Annu. Rev. Biochem. 56:125-158(1987). [2] Delarue M., Moras D.
BioEssays 15:675-687(1993). [3] Schimmel P. Trends Biochem. Sci. 16:1-3(1991). [4] Nagel
G.M., Doolittle R.F. Proc. Natl. Acad. Sci. U.S.A. 88:8121-8125(1991). [5] Cusack S.,
Haertlein M., Leberman R. Nucleic Acids Res. 19:3489-3498(1991). [6] Cusack S.
Biochimie 75:1077-1081(1993). [7] Cusack S., Berthet-Colominas C., Haertlein M., Nassar
N., Leberman R. Nature 347:249-255(1990). [8] Leveque F., Plateau P., Dessen P., Blanquet
S. Nucleic Acids Res. 18:305-312(1990).



666. Thaumatin family signature

Thaumatin [1] is an intensely sweet-tasting protein (100 000 times sweeter than sucrose on
a molar basis) from Thaumatococcus daniellii, an African bush. The protein is made of about
200 residues and contains 8 disulfide bonds. A number of proteins have been found to be
related to thaumatins. These proteins are listed below (references are only provided for
recently determined sequences). - A maize alpha-amylase/trypsin inhibitor. - Two tobacco
pathogenesis-related proteins: PR-R major and minor forms, which are induced after
infection with viruses. - Salt-induced protein NP24 from tomato. - Osmotin, a salt-induced
protein from tobacco. - Osmotin-like proteins OSML13, OSML15 and OSML81 from potato
[2]. - P21, a leaf protein from soybean. - PWIR2, a leaf protein from wheat. - Zeamatin, a
maize antifungal protein [3]. The exact biological function of all these proteins is not yet
known. A conserved region that includes three cysteine residues known (in thaumatin) to be
involved in disulfide bonds has been selected as a signature pattern.



xxCxxxxxxxxxxxxxxxxCxxCxxCxCxxxxxxxxxxxxxxCxxCxCxxxCxCxxCCxCxxxCxxxxxCxxxCx
[Schematic of disulfide bond connectivity; the bond layout is not recoverable from the source.]
'C': conserved cysteine involved in a disulfide bond. '*': position of the pattern.
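
As a quick consistency check between the schematic and the statement that the protein contains 8 disulfide bonds, the conserved cysteines in the schematic can be counted and paired. This is only a sketch: it assumes the schematic string is transcribed correctly from the text, and the variable names are introduced here for illustration.

```python
# Schematic copied from the text above: 'C' marks a conserved cysteine,
# 'x' any other residue.
SCHEMATIC = (
    "xxCxxxxxxxxxxxxxxxxCxxCxxCxCxxxxxxxxxxxxxxCxxCxCxxxCxCxxCCxCxxxCxxxxxC"
    "xxxCx"
)

cysteines = SCHEMATIC.count("C")
# Each disulfide bond pairs two cysteines.
disulfide_bonds = cysteines // 2

print(cysteines, disulfide_bonds)
```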

Consensus pattern: G-x-[GF]-x-C-x-T-[GA]-D-C-x(1,2)-G-x(2,3)-C

[1] Edens L., Heslinga L., Klok R., Ledeboer A.M., Maat J., Toonen M.Y., Visser C.,
Verrips C.T. Gene 18:1-12(1982). [2] Zhu B., Chen T.H.H., Li P.H. Plant Physiol. 108:929-937(1995).
[3] Malehorn D.E., Borgmeyer J.R., Smith C.E., Shah D.M. Plant Physiol.
106:1471-1481(1994).
667. Thiolases signatures

Two different types of thiolase [1,2,3] are found both in eukaryotes and in prokaryotes:
acetoacetyl-CoA thiolase (EC 2.3.1.9) and 3-ketoacyl-CoA thiolase (EC 2.3.1.16). 3-ketoacyl-CoA
thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and
is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA
thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and involved
in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis. In
eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion
and the other in peroxisomes. There are two conserved cysteine residues important for
thiolase activity. The first, located in the N-terminal section of the enzymes, is involved in the
formation of an acyl-enzyme intermediate; the second, located at the C-terminal extremity, is
the active site base involved in deprotonation in the condensation reaction. Mammalian
nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein
which seems to exist in two different forms: a 14 Kd protein (SCP-2) and a larger 58 Kd
protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in
lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to
SCP-2 while the N-terminal portion is evolutionarily related to thiolases [4]. Three signature
patterns have been developed for this family of proteins, two of which are based on the
regions around the biologically important cysteines. The third is based on a highly conserved
region in the C-terminal part of these proteins.

Consensus pattern: [LIVM]-[NST]-x(2)-C-[SAGLI]-[ST]-[SAG]-[LIVMFYNS]-x-[STAG]-
[LIVM]-x(6)-[LIVM] [C is involved in formation of acyl-enzyme intermediate]
Consensus pattern: N-x(2)-G-G-x-[LIVM]-[SA]-x-G-H-P-x-[GA]-x-[ST]-G
Consensus pattern: [AG]-[LIVMA]-[STAGCLIVM]-[STAG]-[LIVMA]-C-x-[AG]-x-[AG]-
x-[AG]-x-[SAG] [C is the active site residue]

[1] Peoples O.P., Sinskey A.J. J. Biol. Chem. 264:15293-15297(1989). [2] Yang S.-Y., Yang
X.-Y.H., Healy-Louie G., Schulz H., Elzinga M. J. Biol. Chem. 265:10424-10429(1990). [3]
Igual J.C., Gonzalez-Bosch C., Dopazo J., Perez-Ortin J.E. J. Mol. Evol. 35:147-155(1992).
[4] Baker M.E., Billheimer J.T., Strauss J.F. III DNA Cell Biol. 10:695-698(1991).

668. Thioredoxin family active site 
Thioredoxins [1 to 4] are small proteins of approximately one hundred amino-acid residues
which participate in various redox reactions via the reversible oxidation of an active center
disulfide bond. They exist in either a reduced form or an oxidized form where the two
cysteine residues are linked in an intramolecular disulfide bond. Thioredoxin is present in
prokaryotes and eukaryotes and the sequence around the redox-active disulfide bond is
well conserved. Bacteriophage T4 also encodes a thioredoxin but its primary structure is
not homologous to bacterial, plant and vertebrate thioredoxins. A number of eukaryotic
proteins contain domains evolutionarily related to thioredoxin; all of them seem to be protein
disulphide isomerases (PDI). PDI (EC 5.3.4.1) [5,6,7] is an endoplasmic reticulum enzyme
that catalyzes the rearrangement of disulfide bonds in various proteins. The various forms of
PDI which are currently known are: - PDI major isozyme; a multifunctional protein that also
functions as the beta subunit of prolyl 4-hydroxylase (EC 1.14.11.2), as a component of
oligosaccharyl transferase (EC 2.4.1.119), as thyroxine deiodinase (EC 3.8.1.4), as
glutathione-insulin transhydrogenase (EC 1.8.4.2) and as a thyroid hormone-binding protein.
- ERp60 (ER-60; 58 Kd microsomal protein). ERp60 was originally thought to be a
phosphoinositide-specific phospholipase C isozyme and later to be a protease. - ERp72. -
P5. All PDI contain two or three (ERp72) copies of the thioredoxin domain. Bacterial
proteins that act as thiol:disulfide interchange proteins that allow disulfide bond formation in
some periplasmic proteins also contain a thioredoxin domain. These proteins are: -
Escherichia coli dsbA (or prfA) and its orthologs in Vibrio cholerae (tcpG) and Haemophilus
influenzae (por). - Escherichia coli dsbC (or xprA) and its orthologs in Erwinia chrysanthemi
and Haemophilus influenzae. - Escherichia coli dsbD (or dipZ) and its Haemophilus
influenzae ortholog. - Escherichia coli dsbE (or ccmG) and orthologs in Haemophilus
influenzae, Rhodobacter capsulatus (helX), Rhizobiaceae (cycY and tlpA).
Consensus pattern: [LIVMF]-[LIVMSTA]-x-[LIVMFYC]-[FYWSTHE]-x(2)-[FYWGTN]-
C-[GATPLVE]-[PHYWSTA]-C-x(6)-[LIVMFYWT] [The two C's form the redox-active
bond]

[1] Holmgren A. Annu. Rev. Biochem. 54:237-271(1985). [2] Gleason F.K., Holmgren A.
FEMS Microbiol. Rev. 54:271-297(1988). [3] Holmgren A. J. Biol. Chem. 264:13963-13966(1989).
[4] Eklund H., Gleason F.K., Holmgren A. Proteins 11:13-28(1991). [5]
Freedman R.B., Hawkins H.C., Murant S.J., Reid L. Biochem. Soc. Trans. 16:96-99(1988).
[6] Kivirikko K.I., Myllyla R., Pihlajaniemi T. FASEB J. 3:1609-1617(1989). [7] Freedman
R.B., Hirst T.R., Tuite M.F. Trends Biochem. Sci. 19:331-336(1994).
669. (Transcript fac2) Transcription factor TFIIB repeat signature

In eukaryotes the initiation of transcription of protein encoding genes by polymerase II is
modulated by general and specific transcription factors. The general transcription factors
operate through common promoter elements (such as the TATA box). At least seven
different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE,
-IIF, -IIG, and -IIH [1]. Transcription factor IIB (TFIIB) plays a central role in the
transcription of class II genes; it associates with a complex of TFIID-IIA bound to DNA (DA
complex) to form a ternary complex TFIID-IIA-IIB (DAB complex) which is then
recognized by RNA polymerase II [2,3]. TFIIB is a protein of about 315 to 340 amino acid
residues which contains, in its C-terminal part, an imperfect repeat of a domain of about 75
residues. This repeat could contribute an element of symmetry to the folded protein. The
following proteins have been shown to be evolutionarily related to TFIIB: - An archaebacterial
TFIIB homolog. In Pyrococcus woesei a previously undetected open reading frame has been
shown [4] to be highly related to TFIIB. - Fungal transcription factor IIIB 70 Kd subunit
(gene PCF4/TDS4/BRF1) [5]. This protein is a general activator of RNA polymerase III
transcription and plays a role analogous to that of TFIIB in pol III transcription. The central
section of the repeated domain, which is the most conserved part of that domain, has been
selected as a signature pattern.

Consensus pattern: G-[KR]-x(3)-[STAGN]-x-[LIVMYA]-[GSTA](2)-[CSAV]-[LIVM]-
[LIVMFY]-[LIVMA]-[GSA]-[STAC]

[1] Weinmann R. Gene Expr. 2:81-91(1992). [2] Hawley D. Trends Biochem. Sci. 16:317-318(1991).
[3] Ha I., Lane W.S., Reinberg D. Nature 352:689-695(1991). [4] Ouzounis C.,
Sander C. Cell 71:189-190(1992). [5] Khoo B., Brophy B., Jackson S.P. Genes Dev. 8:2879-2890(1994).

670. (transcript fact) MADS-box domain signature and profile

A number of transcription factors contain a conserved domain of 56 amino-acid residues,
sometimes known as the MADS-box domain [E1]. They are listed below: - Serum response
factor (SRF) [1], a mammalian transcription factor that binds to the Serum Response Element
(SRE). This is a short sequence of dyad symmetry located 300 bp to the 5' end of the
transcription initiation site of genes such as c-fos. - Mammalian myocyte-specific enhancer
factors 2A to 2D (MEF2A to MEF2D). These proteins are transcription factors which bind
specifically to the MEF2 element present in the regulatory regions of many muscle-specific
genes. - Drosophila myocyte-specific enhancer factor 2 (MEF2). - Yeast GRM/PRTF protein
(gene MCM1) [2], a transcriptional regulator of mating-type-specific genes. - Yeast arginine
metabolism regulation protein I (gene ARGR1 or ARG80). - Yeast transcription factor
RLM1. - Yeast transcription factor SMP1. - Arabidopsis thaliana agamous protein (AG) [3], a
probable transcription factor involved in regulating genes that determine stamen and carpel
development in wild-type flowers. Mutations in the AG gene result in the replacement of the
stamens by petals and the carpels by a new flower. - Arabidopsis thaliana homeotic proteins
Apetala1 (AP1), Apetala3 (AP3) and Pistillata (PI) which act locally to specify the identity of
the floral meristem and to determine sepal and petal development [4]. - Antirrhinum majus
and tobacco homeotic proteins deficiens (DEFA) and globosa (GLO) [5]. Both proteins are
transcription factors involved in the genetic control of flower development. Mutations in
DEFA or GLO cause the transformation of petals into sepals and of stamina into carpels. -
Arabidopsis thaliana putative transcription factors AGL1 to AGL6 [6]. - Antirrhinum majus
morphogenetic protein DEF H33 (squamosa). In SRF, the conserved domain has been shown
[1] to be involved in DNA-binding and dimerization. A pattern that spans the complete length
of the domain has been derived. The profile also spans the length of the MADS-box.

Consensus pattern: R-x-[RK]-x(5)-I-x-[DNGSK]-x(3)-[KR]-x(2)-T-[FY]-x-[RK](3)-x(2)-
[LIVM]-x-K(2)-A-x-E-[LIVM]-[STA]-x-L-x(4)-[LIVM]-x-[LIVM](3)-x(6)-[LIVMF]-x(2)-
[FY]

[ 1] Norman C., Runswick M., Pollock R., Treisman R. Cell 55:989-1003(1988).[ 2]
Passmore S., Maine G.T., Elble R., Christ C., Tye B.-K. J. Mol. Biol. 204:593-606(1988).[ 3]
Yanofsky M., Ma H., Bowman J., Drews G., Feldmann K.A., Meyerowitz E.M. Nature
346:35-39(1990).[ 4] Goto K., Meyerowitz E.M. Genes Dev. 8:1548-1560(1994).[ 5]
Troebner W., Ramirez L., Motte P., Hue I., Huijser P., Loennig W.-E., Saedler H., Sommer
H., Schwartz-Sommer Z. EMBO J. 11:4693-4704(1992).[ 6] Ma H., Yanofsky M.F.,
Meyerowitz E.M. Genes Dev. 5:484-495(1991).[E1]

671. Transketolase signatures

Transketolase (EC 2.2.1.1) (TK) catalyzes the reversible transfer of a two-carbon ketol unit
from xylulose 5-phosphate to an aldose acceptor, such as ribose 5-phosphate, to form
sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with
transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK
requires thiamin pyrophosphate as a cofactor. In most sources where TK has been purified, it
is a homodimer of approximately 70 Kd subunits. TK sequences from a variety of eukaryotic
and prokaryotic sources [1,2] show that the enzyme has been evolutionarily conserved. In the
peroxisomes of the methylotrophic yeast Hansenula polymorpha, there is a highly related
enzyme, dihydroxy-acetone synthase (DHAS) (EC 2.2.1.3) (also known as formaldehyde
transketolase), which exhibits a very unusual specificity by including formaldehyde amongst
its substrates. 1-deoxyxylulose-5-phosphate synthase (DXP synthase) [3] is an enzyme so far
found in bacteria (gene dxs) and plants (gene CLA1) which catalyzes the thiamin
pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of
pyruvate and glyceraldehyde 3-phosphate to yield 1-deoxy-D-xylulose-5-phosphate (dxp), a
precursor in the biosynthetic pathway to isoprenoids, thiamin (vitamin B1), and pyridoxol
(vitamin B6). DXP synthase is evolutionarily related to TK. Two regions of TK have been
selected as signature patterns. The first, located in the N-terminal section, contains a histidine
residue which appears to function in proton transfer during catalysis [4]. The second, located
in the central section, contains conserved acidic residues that are part of the active cleft and
may participate in substrate-binding [4].

Consensus pattern: R-x(3)-[LIVMTA]-[DENQSTHKF]-x(5,6)-[GSN]-G-H-[PLIVMF]- 
[GSTA]-x(2)-[LIMC]-[GS 

Consensus pattern: G-[DEQGSA]-[DN]-G-[PAEQ]-[ST]-[HQ]-x-[PAGM]-[LIVMYAC]-
[DEFYW]-x(2)-[STAP]-x(2)-[RGA]

[ 1] Abedinia M., Layfield R., Jones S.M., Nixon P.P., Mattick J.S. Biochem. Biophys. Res.
Commun. 183:1159-1166(1992).[ 2] Fletcher T.S., Kwee I.L., Nakada T., Largman C.,
Martin B.M. Biochemistry 31:1892-1896(1992).[ 3] Sprenger G.A., Schorken U., Wiegert T.,
Grolle S., De Graaf A.A., Taylor S.V., Begley T.P., Bringer-Meyer S., Sahm H. Proc. Natl.
Acad. Sci. U.S.A. 94:12857-12862(1997).[ 4] Lindqvist Y., Schneider G., Ermler U.,
Sundstroem M. EMBO J. 11:2373-2379(1992).

672. Transmembrane 4 family signature

Recently a number of eukaryotic cell surface antigens have been found to be evolutionarily
related [1,2,3]. The proteins known to belong to this family are listed below: - Mammalian
antigen CD9 (MIC3), a protein involved in platelet activation and aggregation. - Mammalian
leukocyte antigen CD37, expressed on B lymphocytes. - Mammalian leukocyte antigen CD53
(OX-44), which may be involved in growth regulation in hematopoietic cells. - Mammalian
lysosomal membrane protein CD63 (melanoma-associated antigen ME491; antigen AD1). -
Mammalian antigen CD81 (cell surface protein TAPA-1), which may play an important role
in the regulation of lymphoma cell growth. - Mammalian antigen CD82 (protein R2; antigen
C33; Kangai 1 (KAI1)), which associates with CD4 or CD8 and delivers costimulatory
signals for the TCR/CD3 pathway. - Mammalian antigen CD151 (SFA-1; platelet-endothelial
tetraspan antigen 3 (PETA-3)). - Mammalian cell surface glycoprotein A15 (TALLA-1;
MXS1). - Mammalian novel antigen 2 (NAG-2). - Human tumor-associated antigen CO-029.
- Schistosoma mansoni and japonicum 23 Kd surface antigen (SM23 / SJ23). These proteins
share the following characteristics: they all seem to be type III membrane proteins (type III
proteins are integral membrane proteins that contain an N-terminal membrane-anchoring
domain which is not cleaved during biosynthesis and which functions both as a translocation
signal and as a membrane anchor); they also contain three additional transmembrane regions,
at least seven conserved cysteine residues, and are of approximately the same size (218 to
284 residues). These proteins are collectively known as the 'transmembrane 4 superfamily'
(TM4) because they span the plasma membrane four times. A schematic diagram of the
domain structure of these proteins is shown below.

[Schematic, N- to C-terminus: TMa | Extra | TM2 | Cyt | TM3 | Extracellular | TM4 | Cyt, with
the conserved cysteines ('C') and the position of the pattern ('*') marked beneath the domains]

Cyt: cytoplasmic domain. TMa: transmembrane anchor. TM2 to TM4: transmembrane regions 2
to 4. 'C': conserved cysteine. '*': position of the pattern.

A conserved region that includes two cysteines and seems to be located in a short 
cytoplasmic loop between two transmembrane domains has been selected as a signature for 
these proteins. 

Consensus pattern: G-x(3)-[LIVMF]-x(2)-[GSA]-[LIVMF](2)-G-C-x-[GA]-[STA]-x(2)-
[EG]-x(2)-[CWN]-[LIVM](2)

[ 1] Levy S., Nguyen V.Q., Andria M.L., Takahashi S. J. Biol. Chem. 266:14597-
14602(1991).[ 2] Tomlinson M.G., Williams A.F., Wright M.D. Eur. J. Immunol. 23:136-
40(1993).[ 3] Barclay A.N., Birkeland M.L., Brown M.H., Beyers A.D., Davis S.J., Somoza
C., Williams A.F. The leucocyte antigen factbooks. Academic Press, London / San Diego,
(1993).



673. Tryptophan synthase alpha chain signature

Tryptophan synthase catalyzes the last step in the biosynthesis of tryptophan: the conversion
of indoleglycerol phosphate and serine to tryptophan and glyceraldehyde 3-phosphate [1,2]. It
has two functional domains: one for the aldol cleavage of indoleglycerol phosphate to indole
and glyceraldehyde 3-phosphate and the other for the synthesis of tryptophan from indole and
serine. In bacteria and plants [3], each domain is found on a separate subunit (alpha and beta
chains), while in fungi the two domains are fused together on a single multifunctional protein.
A conserved region that contains three conserved acidic residues has been selected as a
signature pattern for the alpha chain. The first and the third acidic residues are believed to
serve as proton donors/acceptors in the enzyme's catalytic mechanism.

Consensus pattern: [LIVM]-E-[LIVM]-G-x(2)-[FYC]-[ST]-[DE]-[PA]-[LIVMY]-[AGLI]-
[DE]-G

[ 1] Crawford I.P. Annu. Rev. Microbiol. 43:567-600(1989).[ 2] Hyde C.C., Miles E.W.
Bio/Technology 8:27-32(1990).[ 3] Berlyn M.B., Last R.L., Fink G.R. Proc. Natl. Acad. Sci.
U.S.A. 86:4604-4608(1989).



674. Tryptophan synthase beta chain pyridoxal-phosphate attachment site 

Tryptophan synthase catalyzes the last step in the biosynthesis of tryptophan: the conversion
of indoleglycerol phosphate and serine to tryptophan and glyceraldehyde 3-phosphate [1,2]. It
has two functional domains: one for the aldol cleavage of indoleglycerol phosphate to indole
and glyceraldehyde 3-phosphate and the other for the synthesis of tryptophan from indole and
serine. In bacteria and plants [3], each domain is found on a separate subunit (alpha and beta
chains), while in fungi the two domains are fused together on a single multifunctional protein.
The beta chain of the enzyme requires pyridoxal-phosphate as a cofactor. The pyridoxal-
phosphate group is attached to a lysine residue. The region around this lysine residue also
contains two histidine residues which are part of the pyridoxal-phosphate binding site. The
signature pattern for the tryptophan synthase beta chain is derived from that conserved region.

Consensus pattern: [LIVM]-x-H-x-G-[STA]-H-K-x-N [K is the pyridoxal-P attachment site]

[ 1] Crawford I.P. Annu. Rev. Microbiol. 43:567-600(1989).[ 2] Hyde C.C., Miles E.W.
Bio/Technology 8:27-32(1990).[ 3] Berlyn M.B., Last R.L., Fink G.R. Proc. Natl. Acad. Sci.
U.S.A. 86:4604-4608(1989).

675. Serine proteases, trypsin family, active sites 

The catalytic activity of the serine proteases from the trypsin family is provided by a charge
relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which itself is
hydrogen-bonded to a serine. The sequences in the vicinity of the active site serine and
histidine residues are well conserved in this family of proteases [1]. A partial list of proteases
known to belong to the trypsin family is shown below. - Acrosin. - Blood coagulation factors
VII, IX, X, XI and XII, thrombin, plasminogen, and protein C. - Cathepsin G. -
Chymotrypsins. - Complement components C1r, C1s, C2, and complement factors B, D and
I. - Complement-activating component of RA-reactive factor. - Cytotoxic cell proteases
(granzymes A to H). - Duodenase 1. - Elastases 1, 2, 3A, 3B (protease E), leukocyte
(medullasin). - Enterokinase (EC 3.4.21.9) (enteropeptidase). - Hepatocyte growth factor
activator. - Hepsin. - Glandular (tissue) kallikreins (including EGF-binding protein types A,
B, and C, NGF-gamma chain, gamma-renin, prostate specific antigen (PSA) and tonin). -
Plasma kallikrein. - Mast cell proteases (MCP) 1 (chymase) to 8. - Myeloblastin (proteinase
3) (Wegener's autoantigen). - Plasminogen activators (urokinase-type, and tissue-type). -
Trypsins I, II, III, and IV. - Tryptases. - Snake venom proteases such as ancrod, batroxobin,
cerastobin, flavoxobin, and protein C activator. - Collagenase from common cattle grub and
collagenolytic protease from Atlantic sand fiddler crab. - Apolipoprotein(a). - Blood fluke
cercarial protease. - Drosophila trypsin like proteases: alpha, easter, snake-locus. - Drosophila
protease stubble (gene sb). - Major mite fecal allergen Der p III. All the above proteins
belong to family S1 in the classification of peptidases [2,E1] and originate from eukaryotic
species. It should be noted that bacterial proteases that belong to family S2A are similar
enough in the regions of the active site residues that they can be picked up by the same
patterns. These proteases are listed below. - Achromobacter lyticus protease I. - Lysobacter
alpha-lytic protease. - Streptogrisin A and B (Streptomyces proteases A and B). -
Streptomyces griseus glutamyl endopeptidase II. - Streptomyces fradiae proteases 1 and 2.

Consensus pattern: [LIVM]-[ST]-A-[STAG]-H-C [H is the active site residue]

Consensus pattern: [DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-
[LIVMFYWH]-[LIVMFYSTANQH] [S is the active site residue]
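
By way of illustration only, the two active-site patterns above can be hand-translated into regular expressions and used to report where the annotated histidine and serine fall in a sequence. The following Python sketch is not part of the specification: the function name `find_active_sites` and the toy sequence are hypothetical, and a match indicates only that the motif is present, not that the protein is an active protease.

```python
import re

# Hand-translated from the two consensus patterns above; the capturing
# group marks the residue annotated as the active site.
HIS_SITE = re.compile(r"[LIVM][ST]A[STAG](H)C")
SER_SITE = re.compile(
    r"[DNSTAGC][GSTAPIMVQH].{2}G[DE](S)G[GS][SAPHV][LIVMFYWH][LIVMFYSTANQH]"
)

def find_active_sites(sequence):
    """Return 1-based positions of the motif His and Ser (None if absent)."""
    sites = {}
    for name, regex in (("His", HIS_SITE), ("Ser", SER_SITE)):
        match = regex.search(sequence)
        sites[name] = match.start(1) + 1 if match else None
    return sites

# Toy sequence assembled for illustration only (hypothetical, not a real protein).
toy = "WVVSAAHCYK" + "DSCQGDSGGPVV"
print(find_active_sites(toy))
```

Only the first occurrence of each motif is reported; a fuller scan would iterate over `regex.finditer(sequence)`.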

[ 1] Brenner S. Nature 334:528-530(1988).[ 2] Rawlings N.D., Barrett A.J. Meth. Enzymol. 
244:19-61(1994).[E1] 

676. (tsp) Thrombospondin type 1 domain 
[1] Bork P; FEBS lett 1993;327:125-130. 

677. Tubulin subunits alpha, beta, and gamma signature 

Tubulins [1,2], the major constituent of microtubules, are dimeric proteins which consist of
two closely related subunits (alpha and beta). Tubulin binds two molecules of GTP at two
different sites (N and E). At the E (Exchangeable) site, GTP is hydrolyzed during
incorporation into the microtubule. Near the E site is an invariant region rich in glycines
which is found in both chains and which is now [3] said to control the access of the nucleotide
to its binding site. A signature pattern was developed from this region. With the exception of
the simple eukaryotes, most species express a variety of closely related alpha and beta
isotypes. In most species there is a third member of the tubulin family: gamma tubulin.
Gamma tubulin is found at microtubule organizing centers (MTOC) such as the spindle poles
or the centrosome, suggesting that it is involved in the minus-end nucleation of microtubule
assembly [4].

Consensus pattern: [SAG]-G-G-T-G-[SA]-G 
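
The consensus patterns in these entries use a compact notation: residues separated by hyphens, `x` for any residue, square brackets for allowed residues, braces for excluded residues, a parenthesised repeat count, and `<` or `>` for the sequence ends. As a minimal sketch, assuming that reading of the notation, such a pattern can be translated into a Python regular expression as follows. The function name `prosite_to_regex` and the test sequence are hypothetical and are not taken from the specification.

```python
import re

# One pattern element: optional "<", a residue / x / [allowed] / {excluded},
# an optional repeat such as (2) or (4,6), and an optional ">".
_ELEMENT = re.compile(
    r"^(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)$"
)

def prosite_to_regex(pattern):
    """Translate a hyphen-separated consensus pattern into a regex string."""
    pieces = []
    for raw in pattern.split("-"):
        m = _ELEMENT.match(raw.strip())
        if m is None:
            # Truncated or damaged elements are rejected rather than guessed.
            raise ValueError(f"unrecognised pattern element: {raw!r}")
        start, core, lo, hi, end = m.groups()
        if core == "x":
            token = "."                       # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"   # excluded residues
        else:
            token = core                      # single residue or [allowed] set
        if lo is not None:
            token += "{" + lo + ("," + hi if hi else "") + "}"
        pieces.append(("^" if start else "") + token + ("$" if end else ""))
    return "".join(pieces)

tubulin = prosite_to_regex("[SAG]-G-G-T-G-[SA]-G")
print(tubulin)
# Hypothetical test sequence containing the glycine-rich motif.
print(re.search(tubulin, "MKLAGGTGSGVV").start())
```

Patterns that are cut off in the text, for example those ending in an unclosed bracket, raise `ValueError` here instead of being completed by guesswork.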

[ 1] Cleveland D.W., Sullivan K.F. Annu. Rev. Biochem. 54:331-365(1985).[ 2] Joshi H.C., 
Cleveland D.W. Cell Motil. Cytoskeleton 16:159-163(1990).[ 3] Hesse J., Thierauf M., 
Ponstingl H. J. Biol. Chem. 262:15472-15475(1987).[ 4] Joshi H.C. BioEssays 15:637- 
643(1993). 

Tubulin-beta mRNA autoregulation signal 

The stability of beta-tubulin mRNAs is autoregulated by their own translation product [1].
Unpolymerized tubulin subunits bind directly (or activate a factor(s) which binds co-
translationally) to the nascent N-terminus of beta-tubulin. This binding is transduced through
the adjacent ribosomes to activate an RNAse that degrades the polysome-bound mRNA. The
recognition element has been shown to be the first four amino acids of beta-tubulin: Met-Arg-
Glu-Ile. Mutations to this sequence abolish the autoregulation effect (except for the
replacement of Glu by Asp); transposition of this sequence to an internal region of a
polypeptide also suppresses the autoregulatory effect.

Consensus pattern: <M-R-[DE]-[IL]
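
The leading `<` in this pattern restricts the match to the N-terminus, which mirrors the observation that an internal copy of the tetrapeptide has no autoregulatory effect. A short Python sketch of that check is given below; it assumes `<` corresponds to the regex anchor `^`, and the function name and example sequences are hypothetical.

```python
import re

# "<" anchors the pattern to the N-terminus, i.e. "^" in a regular expression.
AUTOREG_SIGNAL = re.compile(r"^MR[DE][IL]")

def has_autoregulation_signal(protein):
    """True if the sequence begins with the M-R-[DE]-[IL] recognition element."""
    return AUTOREG_SIGNAL.match(protein) is not None

# Hypothetical example sequences.
print(has_autoregulation_signal("MREIVHIQAG"))   # signal at the N-terminus
print(has_autoregulation_signal("MRDIVHIQAG"))   # Glu replaced by Asp
print(has_autoregulation_signal("GSMREIVHIQ"))   # same residues, internal position
```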

[ 1] Cleveland D.W. Trends Biochem. Sci. 13:339-343(1988). 

678. (tRNA-synt_2c) Aminoacyl-transfer RNA synthetases class-II signatures

Aminoacyl-tRNA synthetases (EC 6.1.1.-) [1] are a group of enzymes which activate amino acids and
transfer them to specific tRNA molecules as the first step in protein biosynthesis. In
prokaryotic organisms there are at least twenty different types of aminoacyl-tRNA
synthetases, one for each different amino acid. In eukaryotes there are generally two
aminoacyl-tRNA synthetases for each different amino acid: one cytosolic form and one
mitochondrial form. While all these enzymes have a common function, they are widely
diverse in terms of subunit size and of quaternary structure. The synthetases specific for
alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine,
and threonine are referred to as class-II synthetases [2 to 6] and probably have a common
folding pattern in their catalytic domain for the binding of ATP and amino acid which is
different from the Rossmann fold observed for the class I synthetases [7]. Class-II tRNA
synthetases do not share a high degree of similarity; however, at least three conserved regions
are present [2,5,8]. Signature patterns have been derived from two of these regions.

Consensus pattern: [FYH]-R-x-[DE]-x(4,12)-[RH]-x(3)-F-x(3)-[DE]- 

Consensus pattern: [GSTALVF]-{DENQHRKP}-[GSTA]-[LIVMF]-[DE]-R-[LIVMF]-x-
[LIVMSTAG]-[LIVMFY]-

[ 1] Schimmel P. Annu. Rev. Biochem. 56:125-158(1987).[ 2] Delarue M., Moras D.
BioEssays 15:675-687(1993).[ 3] Schimmel P. Trends Biochem. Sci. 16:1-3(1991).[ 4] Nagel
G.M., Doolittle R.F. Proc. Natl. Acad. Sci. U.S.A. 88:8121-8125(1991).[ 5] Cusack S.,
Haertlein M., Leberman R. Nucleic Acids Res. 19:3489-3498(1991).[ 6] Cusack S.
Biochimie 75:1077-1081(1993).[ 7] Cusack S., Berthet-Colominas C., Haertlein M., Nassar
N., Leberman R. Nature 347:249-255(1990).[ 8] Leveque F., Plateau P., Dessen P., Blanquet
S. Nucleic Acids Res. 18:305-312(1990).

679. UBA-domain 

The UBA-domain (ubiquitin associated domain) is a novel sequence motif found in 
several proteins having connections to ubiquitin and the ubiquitination pathway. The 
structure of the UBA domain consists of a compact three helix bundle [1]. Number of 
members: 84 

[1] Structure of a human DNA repair protein UBA domain that interacts with HIV-1 
Vpr. Dieckmann T, Withers-Ward ES, Jarosinski MA, Liu CF, Chen IS, Feigon J; Nat Struct 
Biol 1998;5:1042-1047. 

680. UBX domain 

Domain present in ubiquitin-regulatory proteins. Present in FAF1 and Shp1p. Number of
members: 19

[1] The UBA domain: a sequence motif present in multiple enzyme classes of the 
ubiquitination pathway. Hofmann K, Bucher P; Trends Biochem Sci 1996;21:172-173. 

681. (UCH) Ubiquitin carboxyl-terminal hydrolases family 1 cysteine active site 
Ubiquitin carboxyl-terminal hydrolases (UCH) (deubiquitinating enzymes) [1,2] are thiol
proteases that recognize and hydrolyze the peptide bond at the C-terminal glycine of
ubiquitin. These enzymes are involved in the processing of poly-ubiquitin precursors as well
as that of ubiquitinated proteins. There are two distinct families of UCH. The first class consists
of enzymes of about 25 Kd and is currently represented by: - Mammalian isozymes L1 and
L3. - Yeast YUH1. - Drosophila Uch. One of the active site residues of class-I UCH [3] is a
cysteine. A signature pattern has been derived from the region around that residue.

Consensus pattern: Q-x(3)-N-[SA]-C-G-x(3)-[LIVM](2)-H-[SA]-[LIVM]-[SA] [C is the
active site residue]

[ 1] Jentsch S., Seufert W., Hauser H.-P. Biochim. Biophys. Acta 1089:127-139(1991).[ 2]
D'andrea A., Pellman D. Crit. Rev. Biochem. Mol. Biol. 33:337-352(1998).[ 3] Johnston
S.C., Larsen C.N., Cook W.J., Wilkinson K.D., Hill C.P. EMBO J. 16:3787-3796(1997).[ 4]
Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:461-486(1994).

682. Ubiquitin carboxyl-terminal hydrolases family 2 signatures (UCH-1)

Ubiquitin carboxyl-terminal hydrolases (UCH) (deubiquitinating enzymes) [1,2] are thiol
proteases that recognize and hydrolyze the peptide bond at the C-terminal glycine of
ubiquitin. These enzymes are involved in the processing of poly-ubiquitin precursors as well
as that of ubiquitinated proteins. There are two distinct families of UCH. The second class
consists of large proteins (800 to 2000 residues) and is currently represented by: - Yeast UBP1,
UBP2, UBP3, UBP4 (or DOA4/SSV7), UBP5, UBP7, UBP9, UBP10, UBP11, UBP12,
UBP13, UBP14, UBP15 and UBP16. - Human tre-2. - Human isopeptidase T. - Human
isopeptidase T-3. - Mammalian Ode-1. - Mammalian Unp. - Mouse Dub-1. - Drosophila fat
facets protein (gene faf). - Mammalian faf homolog. - Drosophila D-Ubp-64E. -
Caenorhabditis elegans hypothetical protein R10E11.3. - Caenorhabditis elegans hypothetical
protein K02C4.3. These proteins only share two regions of similarity. The first region
contains a conserved cysteine which is probably implicated in the catalytic mechanism. The
second region contains two conserved histidine residues, one of which is also probably
implicated in the catalytic mechanism. Signature patterns for both conserved regions have
been developed.

Consensus pattern: G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMC]-[NST]-
[SACV]-x-[LIVMS]-Q [C is the putative active site residue]

Consensus pattern: Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y [The two H's are 
putative active site residues] 
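
Because both patterns contain variable-length gaps, the stretch of sequence they cover is not fixed. As an illustrative sketch only, the following Python function computes the shortest and longest span a pattern of this kind can match. It assumes the repeat notation `(n)` and `(n,m)` used throughout these entries and reads the first pattern's gap as `x(1,3)`; the function name `pattern_length_range` is hypothetical.

```python
import re

def pattern_length_range(pattern):
    """Return (min, max) number of residues a hyphen-separated pattern can span."""
    shortest = longest = 0
    for element in pattern.split("-"):
        repeat = re.search(r"\((\d+)(?:,(\d+))?\)$", element.strip())
        if repeat is None:
            lo = hi = 1                         # a single position
        else:
            lo = int(repeat.group(1))
            hi = int(repeat.group(2) or lo)     # (n) means exactly n positions
        shortest += lo
        longest += hi
    return shortest, longest

cys_region = ("G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMC]-[NST]-"
              "[SACV]-x-[LIVMS]-Q")
his_region = "Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y"
print(pattern_length_range(cys_region))
print(pattern_length_range(his_region))
```

Knowing the span is useful when choosing a window size for scanning or when checking that a reported hit is long enough to contain the whole motif.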
[ 1] Jentsch S., Seufert W., Hauser H.-P. Biochim. Biophys. Acta 1089:127-139(1991).[ 2]
D'andrea A., Pellman D. Crit. Rev. Biochem. Mol. Biol. 33:337-352(1998).[ 3] Rawlings
N.D., Barrett A.J. Meth. Enzymol. 244:461-486(1994).

683. Ubiquitin carboxyl-terminal hydrolases family 2 signatures (UCH-2)

Ubiquitin carboxyl-terminal hydrolases (UCH) (deubiquitinating enzymes) [1,2] are thiol
proteases that recognize and hydrolyze the peptide bond at the C-terminal glycine of
ubiquitin. These enzymes are involved in the processing of poly-ubiquitin precursors as well
as that of ubiquitinated proteins. There are two distinct families of UCH. The second class
consists of large proteins (800 to 2000 residues) and is currently represented by: - Yeast UBP1,
UBP2, UBP3, UBP4 (or DOA4/SSV7), UBP5, UBP7, UBP9, UBP10, UBP11, UBP12,
UBP13, UBP14, UBP15 and UBP16. - Human tre-2. - Human isopeptidase T. - Human
isopeptidase T-3. - Mammalian Ode-1. - Mammalian Unp. - Mouse Dub-1. - Drosophila fat
facets protein (gene faf). - Mammalian faf homolog. - Drosophila D-Ubp-64E. -
Caenorhabditis elegans hypothetical protein R10E11.3. - Caenorhabditis elegans hypothetical
protein K02C4.3. These proteins only share two regions of similarity. The first region
contains a conserved cysteine which is probably implicated in the catalytic mechanism. The
second region contains two conserved histidine residues, one of which is also probably
implicated in the catalytic mechanism. Signature patterns for both conserved regions have
been developed.

Consensus pattern: G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMC]-[NST]-
[SACV]-x-[LIVMS]-Q [C is the putative active site residue]

Consensus pattern: Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y [The two H's are
putative active site residues]

[ 1] Jentsch S., Seufert W., Hauser H.-P. Biochim. Biophys. Acta 1089:127-139(1991).[ 2] 
D'andrea A., Pellman D. Crit. Rev. Biochem. Mol. Biol. 33:337-352(1998). [ 3] Rawlings 
N.D., Barrett A.J. Meth. Enzymol. 244:461-486(1994). 


684. UDP-glycosyltransferases signature 

UDP glycosyltransferases (UGT) are a superfamily of enzymes that catalyze the addition of
the glycosyl group from a UDP-sugar to a small hydrophobic molecule. This family currently
consists of: - Mammalian UDP-glucuronosyl transferases (UDPGT) [1,2]. A large family of
membrane-bound microsomal enzymes which catalyze the transfer of glucuronic acid to a
wide variety of exogenous and endogenous lipophilic substrates. These enzymes are of major
importance in the detoxification and subsequent elimination of xenobiotics such as drugs and
carcinogens. - A large number of putative UDPGT from Caenorhabditis elegans. -
Mammalian 2-hydroxyacylsphingosine 1-beta-galactosyltransferase [3] (also known as UDP-
galactose-ceramide galactosyltransferase). This enzyme catalyzes the transfer of galactose to
ceramide, a key enzymatic step in the biosynthesis of galactocerebrosides, which are
abundant sphingolipids of the myelin membrane of the central nervous system and peripheral
nervous system. - Plant flavonol O(3)-glucosyltransferase. An enzyme [4] that catalyzes the
transfer of glucose from UDP-glucose to a flavonol. This reaction is essential and one of the
last steps in anthocyanin pigment biosynthesis. - Baculovirus ecdysteroid UDP-
glucosyltransferase (EC 2.4.1.-) [5] (egt). This enzyme catalyzes the transfer of glucose from
UDP-glucose to ecdysteroids, which are insect molting hormones. The expression of egt in the
insect host interferes with normal insect development by blocking the molting process. -
Prokaryotic zeaxanthin glucosyl transferase (gene crtX), an enzyme involved in carotenoid
biosynthesis that catalyzes the glycosylation reaction which converts zeaxanthin to
zeaxanthin-beta-diglucoside. - Streptomyces macrolide glycosyltransferases [6]. These
enzymes specifically inactivate macrolide antibiotics via 2'-O-glycosylation using UDP-
glucose. These enzymes share a conserved domain of about 50 amino acid residues located in
their C-terminal section, from which a pattern has been extracted to detect them.

Consensus pattern: [FW]-x(2)-Q-x(2)-[LIVMYA]-[LIMV]-x(4,6)-[LVGAC]-[LVFYA]-
[LIVMF]-[STAGCM]-[HNQ]-[STAGC]-G-x(2)-[STAG]-x(3)-[STAGL]-[LIVMFA]-x(4)-
[PQR]-[LIVMT]-x(3)-[PA]-x(3)-[DES]-[QEHN]

[ 1] Dutton G.J. (In) Glucuronidation of drugs and other compounds, Dutton G.J., Ed., pp 1-
78, CRC Press, Boca Raton, (1980).[ 2] Burchell B., Nebert D.W., Nelson D.R., Bock K.W.,
Iyanagi T., Jansen P.L., Lancet D., Mulder G.J., Chowdhury J.R., Siest G., Tephly T.R.,
Mackenzie P.I. DNA Cell Biol. 10:487-494(1991).[ 3] Schulte S., Stoffel W. Proc. Natl.
Acad. Sci. U.S.A. 90:10265-10269(1993).[ 4] Furtek D., Schiefelbein J.W., Johnston F.,
Nelson O.E. Jr. Plant Mol. Biol. 11:473-481(1988).[ 5] O'Reilly D.R., Miller L.K. Science
245:1110-1112(1989).[ 6] Hernandez C., Olano C., Mendez C., Salas J.A. Gene 134:139-
140(1993).

685. UDP-glucose/GDP-mannose dehydrogenase family

The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes
which possess the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to
an acid without the release of an aldehyde intermediate [2]. Number of members: 55

[1] Purification and characterization of guanosine diphospho-D-mannose
dehydrogenase. A key enzyme in the biosynthesis of alginate by Pseudomonas aeruginosa.
Roychoudhury S, May TB, Gill JF, Singh SK, Feingold DS, Chakrabarty AM; J Biol Chem
1989;264:9380-9385. [2] Properties and kinetic analysis of UDP-glucose dehydrogenase
from group A streptococci. Irreversible inhibition by UDP-chloroacetol. Campbell RE, Sala
RF, van de Rijn I, Tanner ME; J Biol Chem 1997;272:3416-3422.



686. Uracil-DNA glycosylase signature

Uracil-DNA glycosylase (EC 3.2.2.-) (UNG) [1] is a DNA repair enzyme that excises uracil
residues from DNA by cleaving the N-glycosylic bond. Uracil in DNA can arise as a result of
misincorporation of dUMP residues by DNA polymerase or deamination of cytosine. The
sequence of uracil-DNA glycosylase is extremely well conserved [2] in bacteria and
eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are
also found in poxviruses [3]. In eukaryotic cells, UNG activity is found in both the nucleus
and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the
nucleus [4]. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial
localization [4], but the presence of a mitochondrial transit peptide has not been directly
demonstrated. As a signature for this type of enzyme, the most N-terminal conserved region
has been selected. This region contains an aspartic acid residue which has been proposed,
based on X-ray structures [5,6], to act as a general base in the catalytic mechanism.

Consensus pattern: [KR]-[LIV]-[LIVC]-[LIVM]-x-G-[QI]-D-P-Y [D is the active site
residue]

[ 1] Sancar A., Sancar G.B. Annu. Rev. Biochem. 57:29-67(1988).[ 2] Olsen L.C., Aasland
R., Wittwer C.U., Krokan H.E., Helland D.E. EMBO J. 8:3121-3125(1989).[ 3] Upton C.,
Stuart D.T., McFadden G. Proc. Natl. Acad. Sci. U.S.A. 90:4518-4522(1993).[ 4] Slupphaug
G., Markussen F.-H., Olsen L.C., Aasland R., Aarsaether N., Bakke O., Krokan H.E., Helland
D.E. Nucleic Acids Res. 21:2579-2584(1993).[ 5] Savva R., McAuley-Hecht K., Brown T.,
Pearl L. Nature 373:487-493(1995).[ 6] Mol C.D., Arvai A.S., Slupphaug G., Kavli B.,
Alseth I., Krokan H.E., Tainer J.A. Cell 80:869-878(1995).[ 7] Muller S.J., Caradonna S.
Biochim. Biophys. Acta 1088:197-207(1991).[ 8] Meyer-Siegler K., Mauro D.J., Seal G.,
Wurzer J., Deriel J.K., Sirover M.A. Proc. Natl. Acad. Sci. U.S.A. 88:8460-8464(1991).[ 9]
Muller S.J., Caradonna S. J. Biol. Chem. 268:1310-1319(1993).[10] Barnes D.E., Lindahl T.,
Sedgwick B. Curr. Opin. Cell Biol. 5:424-433(1993).

The following uncharacterized proteins have been shown [1] to share regions of similarities: -
Yeast chromosome II hypothetical protein YBL036c. - Caenorhabditis elegans hypothetical
protein F09E5.8. - Bacillus subtilis hypothetical protein ylmE. - Escherichia coli hypothetical
protein yggS and HI0090, the corresponding Haemophilus influenzae protein. - Helicobacter
pylori hypothetical protein HP0395. - Mycobacterium tuberculosis hypothetical protein
MtCY270.20. - Synechocystis strain PCC 6803 hypothetical protein slr0556. - A
Pseudomonas aeruginosa hypothetical protein in pilT 5'region. - A Vibrio alginolyticus
hypothetical protein in pilT 5'region. These are proteins of 25 to 30 Kd which contain a
number of conserved regions. The best conserved region, which is located in the first third of
these proteins, has been selected as a signature pattern.

Consensus pattern: [FW]-H-[FM]-[IV]-G-x-[LIV]-Q-x-[NKR]-K-x(3)-[LIV]

[ 1] Bairoch A., Rudd K.E. Unpublished observations (1996).

688. Uncharacterized protein family UPF0003 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: -
Escherichia coli protein aefA. - Escherichia coli hypothetical protein yggB. - Escherichia coli
hypothetical protein yjeP and HI0195.1, the corresponding Haemophilus influenzae protein. -
Escherichia coli hypothetical protein ynaI. - Bacillus subtilis hypothetical protein yhdY. -
Helicobacter pylori hypothetical protein HP0415. - Synechocystis strain PCC 6803
hypothetical protein slr0639. - Archaeoglobus fulgidus hypothetical protein AF1546. -
Methanococcus jannaschii hypothetical protein MJ0170. - Methanococcus jannaschii
hypothetical protein MJ1143. The size of these proteins ranges from 30 to 120 Kd. They all
contain a number of transmembrane regions. The best conserved region, which is located in
and just after the last potential transmembrane region, has been selected as a signature
pattern.

Consensus pattern: G-[STIF]-V-x(2)-[LIVM]-x(6)-[LIVMF]-x(3)-[DQ]-x(3)-[LIV]-x-[LIV]-
P-N-x(2)-[LIVMF]-[LIVFSTA]-x(5)-N

[ 1] Bairoch A. Unpublished observations (1997). 

689. Uncharacterized protein family UPF0004 signature

The following uncharacterized proteins have been shown [1] to share regions of similarities: -
Escherichia coli hypothetical protein yliG. - Escherichia coli hypothetical protein yleA and
HI0019, the corresponding Haemophilus influenzae protein. - Bacillus subtilis hypothetical
protein yqeV. - Helicobacter pylori hypothetical protein HP0269. - Helicobacter pylori
hypothetical protein HP0285. - Mycoplasma iowae hypothetical protein in 16S RNA
5'region. - Mycobacterium leprae hypothetical protein B2235_C2_195. - Pseudomonas
aeruginosa hypothetical protein in hemL 3'region. - Synechocystis strain PCC 6803
hypothetical protein slr0082. - Synechocystis strain PCC 6803 hypothetical protein sll0996. -
Methanococcus jannaschii hypothetical protein MJ0865. - Methanococcus jannaschii
hypothetical protein MJ0867. - Caenorhabditis elegans hypothetical protein F25B5.5. The size
of these proteins ranges from 47 to 61 Kd. They contain six conserved cysteines, three of
which are clustered in a region that can be used as a signature pattern.

Consensus pattern: [LIVM]-x-[LIVMT]-x(2)-G-C-x(3)-C-[STAN]-[FY]-C-x-[LIVM]-x(4)-G

[1] Bairoch A. Unpublished observations (1997). 

690. Uncharacterized protein family UPF0005 signature 

The following proteins seem to be evolutionarily related [1]: - Mammalian protein TEGT 
(Testis Enhanced Gene Transcript). - Escherichia coli hypothetical protein yccA and HI0044, 
the corresponding Haemophilus influenzae protein. - A probable Pseudomonas aeruginosa 
ortholog of yccA. These are proteins of about 25 Kd which seem to contain seven 
transmembrane domains. A signature pattern has been developed that corresponds to a region 
starting at the beginning of the third transmembrane domain and ending in the middle of the 
fourth one.

Consensus pattern: G-[LIVM](2)-[SA]-x(5,8)-G-x(2)-[LIVM]-G-P-x-L-x(4)-[SAG]-x(4,6)-
[LIVM](2)-x(2)-A-x(3)-T-A-[LIVM](2)-F

[1] Walter L., Marynen P., Szpirer J., Levan G., Guenther E. Genomics 28:301-304(1995).

691. Uncharacterized protein family UPF0006 signatures 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Yeast chromosome II hypothetical protein YBL055c. - Escherichia coli hypothetical protein 
ycfH and HI0454, the corresponding Haemophilus influenzae protein. - Escherichia coli 
hypothetical protein yigW. - Escherichia coli hypothetical protein yjjV and HI0081, the 
corresponding Haemophilus influenzae protein. - Bacillus subtilis hypothetical protein yabD. 
- Haemophilus influenzae hypothetical protein HI1664. - Mycoplasma genitalium 
hypothetical protein MG009. These are proteins of 24 to 47 Kd which contain a number 
of conserved regions. They can be picked up in the database by the following patterns. 

Consensus pattern: [LIVMFY](2)-D-[STA]-H-x-H-[LIVMF]-[DN]

Consensus pattern: P-[LIVM]-x-[LIVM]-H-x-R-x-[TA]-x-[DE]

Consensus pattern: [LVSA]-[LIVA]-x(2)-[LIVM]-[PS]-x(3)-L-[LIVM]-[LIVMS]-E-T-D-x-P

[1] Bairoch A., Rudd K.E. Unpublished observations (1995).

692. Uncharacterized protein family UPF0007 signature 

The following proteins seem to be evolutionarily related [1]: - Escherichia coli hypothetical 
protein ygbP and HI0672, the corresponding Haemophilus influenzae protein. - Bacillus 
subtilis hypothetical protein yacM. - Mycobacterium tuberculosis hypothetical protein 
MtCY06G11.29c. - Synechocystis strain PCC 6803 hypothetical protein slr0951. - A 
Rhodobacter capsulatus hypothetical protein in nifR3 5'region. Except for the Rhodobacter 
protein, which contains a C-terminal extension, all these proteins have from 225 to 236 amino 
acids. They are hydrophilic proteins that can be picked up in the database by the following 
pattern. 

Consensus pattern: V-L-[IV]-H-D-[GA]-A-R 

[1] Bairoch A. Unpublished observations (1997).

693. Uncharacterized protein family UPF0015 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Yeast chromosome II hypothetical protein YBR002c. - Yeast chromosome XIII hypothetical 
protein YMR101c. - Escherichia coli hypothetical protein yaeU and HI0920, the 
corresponding Haemophilus influenzae protein. - Helicobacter pylori hypothetical protein 
HP1221. - Mycobacterium leprae hypothetical protein B1937_F2_65. - A Corynebacterium 
glutamicum hypothetical protein in aroF 3'region. - A Streptomyces fradiae hypothetical 
protein in transposon Tn4556. - Synechocystis strain PCC 6803 hypothetical protein sll0505. 
- Methanococcus jannaschii hypothetical protein MJ1372. These are proteins of about 26 to 
40 Kd whose central region is well conserved. They can be picked up in the database by the 
following pattern. 

Consensus pattern: [DE]-[LIVMF](3)-R-T-[SG]-G-x(2)-R-x-S-x-[FY]-[LIVM](2)-W-Q- 

[1] Wolfe K.H., Lohan A.J.E. Yeast 10:S41-S46(1994).

694. Uncharacterized protein family UPF0016 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Yeast hypothetical protein YBR187w. - Fission yeast hypothetical protein SpAC17G8.08c. - 
Mouse protein pFT27. - Synechocystis strain PCC 6803 hypothetical protein sll0615. These 
are hydrophobic proteins of 200 to 320 amino acids that seem to contain six or seven 
transmembrane domains. A conserved region which seems, in the eukaryotic proteins of this 
family, to directly follow the second transmembrane domain has been selected as a signature 
pattern. 

Consensus pattern: E-[LIVM]-G-D-K-T-F-[LIVMF](2)-A- 

[1] Bairoch A. Unpublished observations (1996).

695. Uncharacterized protein family UPF0021 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Yeast chromosome VII hypothetical protein YGL211w. - Dictyostelium discoideum protein 
veg136. - Methanococcus jannaschii hypothetical proteins MJ1157 and MJ1478. These are 
proteins of 300 to 360 residues. They can be picked up in the database by the following 
pattern, which is located in their N-terminal section. 

Consensus pattern: C-K-x(2)-F-x(4)-E-x(22,23)-S-G-G-K-D 

[1] Bairoch A. Unpublished observations (1997).

696. Uncharacterized protein family UPF0023 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Mouse protein 22A3. - Yeast chromosome XII hypothetical protein YLR022c. - 
Caenorhabditis elegans hypothetical protein W06E11.4. - Methanococcus jannaschii 
hypothetical protein MJ0592. These are hydrophilic proteins of about 30 Kd. They can be 
picked up in the database by the following pattern. 

Consensus pattern: D-x-D-E-[LIV]-L-x(4)-V-F-x(3)-S-K-G- 

[1] Bairoch A. Unpublished observations (1997).

697. Uncharacterized protein family UPF0024 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Escherichia coli hypothetical protein ygbO and HI0701, the corresponding Haemophilus 
influenzae protein. - Helicobacter pylori hypothetical protein HP0926. - Yeast chromosome 
XV hypothetical protein YOR243c. - Caenorhabditis elegans hypothetical protein B0024.11. - 
Methanococcus jannaschii hypothetical proteins MJ0588 and MJ1364. These are hydrophilic 
proteins of 39 to 77 Kd. They can be picked up in the database by the following pattern. 

Consensus pattern: G-x-K-D-[KR]-x-A-[LV]-T-x-Q-x-[LIVF]-[SGC]- 

[1] Bairoch A. Unpublished observations (1997).

698. Uncharacterized protein family UPF0025 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Escherichia coli hypothetical protein yfcE. - Bacillus subtilis hypothetical protein ysnB. - 
Mycoplasma genitalium and pneumoniae hypothetical protein MG207. - Methanococcus 
jannaschii hypothetical proteins MJ0623 and MJ0936. These are hydrophilic proteins of 
about 20 Kd. They can be picked up in the database by the following pattern. 

Consensus pattern: D-V-[LIV]-x(2)-G-H-[ST]-H-x(12)-[LIVMF]-N-P-G 

[1] Bairoch A. Unpublished observations (1997).

699. Uncharacterized protein family UPF0029 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Yeast chromosome III hypothetical protein YCR59c. - Yeast chromosome IV hypothetical 
protein YDL177C. - Escherichia coli hypothetical protein yigZ and HI0722, the 
corresponding Haemophilus influenzae protein. - Bacillus subtilis hypothetical protein yvyE. 
- A Thermus aquaticus hypothetical protein in pel 5'region. These proteins can be picked up 
in the database by the following pattern. 

Consensus pattern: G-x(2)-[LIVM](2)-x(2)-[LIVM]-x(4)-[LIVM]-x(5)-[LIVM](2)-x-R-
[FYW](2)-G-G-x(2)-[LIVM]-G

[1] Koonin E.V., Bork P., Sander C. EMBO J. 13:493-503(1994).

700. Uncharacterized protein family UPF0030 signature 

The following uncharacterized proteins have been shown [1] to be highly similar: - Yeast 
chromosome VI hypothetical protein YFL060c. - Yeast chromosome XIII hypothetical 
protein YMR095c. - Yeast chromosome XIV hypothetical protein YNL334c. - Bacillus 
subtilis hypothetical protein yaaE. - Haemophilus influenzae hypothetical protein HI1648. - 
Methanococcus jannaschii hypothetical protein MJ1661. These are hydrophilic proteins of 
about 19 to 25 Kd. They can be picked up in the database by the following pattern. 

Consensus pattern: [GA]-L-I-[LIV]-P-G-G-E-S-T-[STA] 

[1] Bairoch A. Unpublished observations (1997).

701. Uncharacterized protein family UPF0032 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Escherichia coli hypothetical protein yigU and HI0188, the corresponding Haemophilus 
influenzae protein. - Bacillus subtilis hypothetical protein ycbT. - Mycobacterium 
tuberculosis hypothetical protein MtCY49.33c and U2126A, the corresponding 
Mycobacterium leprae protein. - Synechocystis strain PCC 6803 hypothetical protein sll0194. 
- Odontella sinensis and Porphyra purpurea chloroplast hypothetical protein ycf43. These 
proteins have from 245 to 317 amino acids and seem to contain at least six or seven 
transmembrane regions. A conserved region located in the central section of these proteins 
has been developed as a signature pattern. 

Consensus pattern: Y-x(2)-F-[LIVMA](2)-x-L-x(4)-G-x(2)-F-[EQ]-[LIVMF]-P-[LIVM]- 

[1] Bairoch A., Rudd K.E. Unpublished observations (1996). 

702. Uncharacterized protein family UPF0034 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Escherichia coli hypothetical protein yhdG and HI0979, the corresponding Haemophilus 
influenzae protein. - Escherichia coli hypothetical protein yjbN and HI0634, the 
corresponding Haemophilus influenzae protein. - Escherichia coli hypothetical protein yohI 
and HI0270, the corresponding Haemophilus influenzae protein. - Bacillus subtilis 
hypothetical protein yacF. - Rhodobacter capsulatus protein nifR3 and related proteins in 
Azospirillum brasilense and Rhizobium leguminosarum. - Synechocystis strain PCC 6803 
hypothetical protein slr0644. - Synechocystis strain PCC 6803 hypothetical protein sll0926. - 
Caenorhabditis elegans hypothetical protein C45G9.2. - Yeast protein SMM1. - Yeast 
hypothetical protein YLR401c. - Yeast hypothetical protein YLR405w. - Yeast hypothetical 
protein YML080w. Although it has been proposed [2] that Rhodobacter capsulatus nifR3 is a 
transcriptional regulatory protein, it is believed that these proteins constitute a family of 
enzymes whose active site could include a conserved cysteine, which has been used as the 
central part of a signature pattern.

Consensus pattern: [LIVM]-[DNG]-[LIVM]-N-x-G-C-P-x(3)-[LIVMASQ]-x(5)-G-[SAC] 

[1] Bairoch A., Rudd K.E. Unpublished observations (1995). 
[2] Foster-Hartnett D., Cullen P.J., Gabbert K.K., Kranz R.G. Mol. Microbiol. 8:903-914(1993).

703. Uncharacterized protein family UPF0038 signature 

The following uncharacterized proteins have been shown [1] to share regions of similarities: - 
Escherichia coli hypothetical protein yacE and HI0890, the corresponding Haemophilus 
influenzae protein. - Mycobacterium tuberculosis hypothetical protein MtCY01B2.23 and 
O410, the corresponding Mycobacterium leprae protein. - Synechocystis strain PCC 6803 
hypothetical protein slr0553. - Other hypothetical proteins from Aeromonas hydrophila, 
Bacteroides nodosus, Neisseria gonorrhoeae, Pseudomonas putida, Thermus thermophilus 
and Xanthomonas campestris. - Human hypothetical protein pOV-2. - Yeast hypothetical 
protein YDR196C. - Caenorhabditis elegans hypothetical protein T05G5.5. These proteins all 
contain, in their N-terminal extremity, an ATP/GTP-binding motif 'A' (P-loop) (see 
<PDOC00017>). The sizes of these proteins range from 200 to 290 residues (with the 
exception of the Mycobacterial sequences, which are 410 residues long). A conserved 
region some 50 residues away from the ATP-binding P-loop has been developed as a 
signature pattern.

Consensus pattern: G-x-[LI]-x-R-x(2)-L-x(4)-F-x(8)-[LIV]-x(5)-P-x-[LIV]- 

[1] Rudd K.E., Bairoch A. Unpublished observations (1997).

704. Ubiquitin-conjugating enzymes active site 

Ubiquitin-conjugating enzymes (UBC or E2 enzymes) [1,2,3] catalyze the covalent 
attachment of ubiquitin to target proteins. An activated ubiquitin moiety is transferred from a 
ubiquitin-activating enzyme (E1) to E2, which later ligates ubiquitin directly to substrate 
proteins with or without the assistance of 'N-end' recognizing proteins (E3). In most species 
there are many forms of UBC (at least 9 in yeast) which are implicated in diverse cellular 
functions. A cysteine residue is required for ubiquitin-thiolester formation. There is a single 
conserved cysteine in UBCs and the region around that residue is conserved in the sequence 
of known UBC isozymes. That region has been used as a signature pattern.

Consensus pattern: [FYWLSP]-H-[PC]-[NH]-[LIV]-x(3,4)-G-x-[LIV]-C-[LIV]-x-[LIV] 
[C is the active site residue] 

[1] Jentsch S., Seufert W., Sommer T., Reins H.-A. Trends Biochem. Sci. 15:195-198(1990). 
[2] Jentsch S., Seufert W., Hauser H.-P. Biochim. Biophys. Acta 1089:127-139(1991). 
[3] Hershko A. Trends Biochem. Sci. 16:265-268(1991).
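
Because the ubiquitin-conjugating enzyme pattern marks one specific cysteine as the active site residue, a match can report the position of that residue as well as the presence of the signature. The sketch below assumes the pattern has been rewritten by hand as a Python regular expression; the function name, the group name active_cys and the toy fragment are hypothetical and serve only as illustration.

```python
import re

# Regular-expression form of the consensus pattern above. The named group
# marks the cysteine that the pattern annotates as the active site residue.
UBC_ACTIVE_SITE = re.compile(
    r"[FYWLSP]H[PC][NH][LIV].{3,4}G.[LIV](?P<active_cys>C)[LIV].[LIV]"
)


def find_active_site_cysteine(sequence):
    """Return the 1-based position of the putative active-site Cys, or None."""
    hit = UBC_ACTIVE_SITE.search(sequence)
    if hit is None:
        return None
    return hit.start("active_cys") + 1


toy_fragment = "MAFHPNIYSNGSICLDILK"  # hypothetical fragment, not a database entry
active_position = find_active_site_cysteine(toy_fragment)
```

A position found this way is only a candidate; the text itself states no more than that the cysteine is required for ubiquitin-thiolester formation.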

705. Uroporphyrinogen decarboxylase signatures 

Uroporphyrinogen decarboxylase (URO-D), the fifth enzyme of the heme biosynthetic 
pathway, catalyzes the sequential decarboxylation of the four acetyl side chains of 
uroporphyrinogen to yield coproporphyrinogen [1]. URO-D deficiency is responsible for the 
human genetic diseases familial porphyria cutanea tarda (fPCT) and hepatoerythropoietic 
porphyria (HEP). The sequence of URO-D has been well conserved throughout evolution. 
The best conserved region is located in the N-terminal section; it contains a 
perfectly conserved hexapeptide. There are two arginine residues in this hexapeptide which 
could be involved in the binding, via salt bridges, to the carboxyl groups of the propionate 
side chains of the substrate. This region has been used as a signature pattern. A second 
signature pattern is based on another well conserved region, which is located in the central 
section of the protein.

Consensus pattern: P-x-W-x-M-R-Q-A-G-R 

Consensus pattern: G-F-[STAGCV]-[STAGC]-x-P-[FYW]-T-[LV]-x(2)-Y-x(2)-[AE]-[GK] 

[1] Garey J.R., Labbe-Bois R., Chelstowska A., Rytka J., Harrison L., Kushner J., Labbe P. 
Eur. J. Biochem. 205:1011-1016(1992).

706. ubiE/COQ5 methyltransferase family signatures 

The following methyltransferases have been shown [1] to share regions of similarities: - 
Escherichia coli ubiE, which is involved in both ubiquinone and menaquinone biosynthesis 
and which catalyzes the S-adenosylmethionine dependent methylation of 
2-polyprenyl-6-methoxy-1,4-benzoquinol into 2-polyprenyl-3-methyl-6-methoxy-1,4-benzoquinol 
and of demethylmenaquinol into menaquinol. - Yeast COQ5, a ubiquinone biosynthesis 
methyltransferase. - Bacillus subtilis spore germination protein C2 (gene: gerCB or gerC2), a 
probable menaquinone biosynthesis methyltransferase. - Lactococcus lactis gerC2 homolog. - 
Caenorhabditis elegans hypothetical protein ZK652.9. - Leishmania donovani 
amastigote-specific protein A41. These are hydrophilic proteins of about 30 Kd (except for 
ZK652.9, which is 65 Kd). They can be picked up in the database by the following patterns. 

Consensus pattern: Y-D-x-M-N-x(2)-[LIVM]-S-x(3)-H-x(2)-W 

Consensus pattern: R-V-[LIVM]-K-[PV]-G-G-x-[LIVMF]-x(2)-[LIVM]-E-x-S 

[1] Lee P.T., Hsu A.Y., Ha H.T., Clarke C.F. J. Bacteriol. 179:1748-1754(1997).

707. Uricase signature 

Uricase (urate oxidase) [1] is the peroxisomal enzyme responsible for the degradation of 
urate into allantoin. Some species, like primates and birds, have lost the gene for uricase and 
are therefore unable to degrade urate. Uricase is a protein of 300 to 400 amino acids. A highly 
conserved region located in the central part of the sequence has been used as a signature 
pattern. 

Consensus pattern: [LV]-x-[LV]-[LIV]-K-[STV]-[ST]-x-[SN]-x-F-x(2)-[FY]-x(4)-[FY]-
x(2)-L-x(5)-R 

[1] Motojima K., Kanaya S., Goto S. J. Biol. Chem. 263:16677-16681(1988). 

708. Universal stress protein family (Usp) 

By a wide range of stress conditions members of the Usp family are predicted to be 
related to the MADS-box proteins transcript_fact and bind to DNA [2]. Number of members: 
39 

[1] Expression and role of the universal stress protein, UspA, of Escherichia coli during 
growth arrest. Nystrom T, Neidhardt FC; Mol Microbiol 1994; 11:537-544. 
[2] Sequence analysis of eukaryotic developmental proteins: ancient and novel domains. 
Mushegian AR, Koonin EV; Genetics 1996; 144:817-828.



709. Ubiquitin domain signature and profile 

Ubiquitin [1,2,3] is a protein of seventy-six amino acid residues, found in all eukaryotic cells 
and whose sequence is extremely well conserved from protozoan to vertebrates. It plays a key 
role in a variety of cellular processes, such as ATP-dependent selective degradation of 
cellular proteins, maintenance of chromatin structure, regulation of gene expression, stress 
response and ribosome biogenesis. In most species, there are many genes coding for 
ubiquitin. However, they can be classified into two classes. The first class produces 
polyubiquitin molecules consisting of exact head to tail repeats of ubiquitin. The number of 
repeats is variable (up to twelve in a Xenopus gene). In the majority of polyubiquitin 
precursors, there is a final amino acid after the last repeat. The second class of genes 
produces precursor proteins consisting of a single copy of ubiquitin fused to a C-terminal 
extension protein (CEP). There are two types of CEP proteins and both seem to be ribosomal 
proteins. Ubiquitin is a globular protein, the last four C-terminal residues (Leu-Arg-Gly-Gly) 
extending from the compact structure to form a 'tail', important for its function. The latter is 
mediated by the covalent conjugation of ubiquitin to target proteins, by an isopeptide linkage 
between the C-terminal glycine and the epsilon amino group of lysine residues in the target 
proteins. There are a number of proteins which are evolutionarily related to ubiquitin: - 
Ubiquitin-like proteins from baculoviruses as well as in some strains of bovine viral diarrhea 
viruses (BVDV). These proteins are highly similar to their eukaryotic counterparts. - 
Mammalian protein GDX [4]. GDX is composed of two domains, an N-terminal ubiquitin-like 
domain of 74 residues and a C-terminal domain of 83 residues with some similarity with the 
thyroglobulin hormonogenic site. - Mammalian protein FAU [5]. FAU is a fusion protein 
which consists of an N-terminal ubiquitin-like protein of 74 residues fused to ribosomal 
protein S30. - Mouse protein NEDD-8 [6], a ubiquitin-like protein of 81 residues. - Human 
protein BAT3, a large fusion protein of 1132 residues that contains an N-terminal 
ubiquitin-like domain. - Caenorhabditis elegans protein ubl-1 [7]. Ubl-1 is a fusion protein 
which consists of an N-terminal ubiquitin-like protein of 70 residues fused to ribosomal 
protein S27A. - Yeast DNA repair protein RAD23 [8]. RAD23 contains an N-terminal domain 
that seems to be distantly, yet significantly, related to ubiquitin. - Mammalian RAD23-related 
proteins RAD23A and RAD23B. - Mammalian BCL-2 binding athanogene-1 (BAG-1). BAG-1 is a 
protein of 274 residues that contains a central ubiquitin-like domain. - Human spliceosome 
associated protein 114 (SAP 114 or SF3A120). - Yeast protein DSK2, a protein involved in 
spindle pole body duplication and which contains an N-terminal ubiquitin-like domain. - 
Human protein CKAP1/TFCB, Schizosaccharomyces pombe protein alp11 and 
Caenorhabditis elegans hypothetical protein F53F4.3. These proteins contain an N-terminal 
ubiquitin domain and a C-terminal CAP-Gly domain. - Schizosaccharomyces pombe 
hypothetical protein SpAC26A3.16. This protein contains an N-terminal ubiquitin domain. - 
Yeast protein SMT3. - Human ubiquitin-like proteins SMT3A and SMT3B. - Human 
ubiquitin-like protein SMT3C (also known as PIC1; Ubl1; Sumo-1; Gmp-1 or Sentrin). This 
protein is involved in targeting ranGAP1 to the nuclear pore complex protein ranBP2. - 
SMT3-like proteins in plants and Caenorhabditis elegans. To identify ubiquitin and related 
proteins, a pattern has been developed based on conserved positions in the central section of 
the sequence. A profile was also developed that spans the complete length of the ubiquitin 
domain.
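
The description of the first gene class (exact head to tail repeats of a 76-residue unit, usually followed by one final amino acid) can be illustrated with a short sketch that cuts a polyubiquitin precursor into units and checks each unit for the Leu-Arg-Gly-Gly tail. The function name and the toy sequences are assumptions for illustration; the toy unit is not the real ubiquitin sequence.

```python
UBIQUITIN_LENGTH = 76      # residues per ubiquitin unit, as stated in the text
C_TERMINAL_TAIL = "LRGG"   # Leu-Arg-Gly-Gly in one-letter code


def split_polyubiquitin(precursor, unit_length=UBIQUITIN_LENGTH):
    """Split a precursor into full-length units plus any trailing residues."""
    n_units = len(precursor) // unit_length
    units = [
        precursor[i * unit_length:(i + 1) * unit_length] for i in range(n_units)
    ]
    trailing = precursor[n_units * unit_length:]
    return units, trailing


# Toy unit: 76 residues ending in the tail (a placeholder, not real ubiquitin).
toy_unit = "M" + "A" * 71 + C_TERMINAL_TAIL
toy_precursor = toy_unit * 3 + "N"  # three repeats plus one final residue

units, trailing = split_polyubiquitin(toy_precursor)
tails_ok = all(unit.endswith(C_TERMINAL_TAIL) for unit in units)
```

This fixed-length split only suits the polyubiquitin class. For the second class, where a single ubiquitin copy is fused to a longer C-terminal extension protein, everything after the first 76 residues would have to be treated as the extension instead.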

Consensus pattern: K-x(2)-[LIVM]-x-[DESAK]-x(3)-[LIVM]-[PA]-x(3)-Q-x-[LIVM]-
[LIVMC]-[LIVMFY]-x-G-x(4)-[DE] 

[1] Jentsch S., Seufert W., Hauser H.-P. Biochim. Biophys. Acta 1089:127-139(1991). 
[2] Monia B.P., Ecker D.J., Crooke S.T. Bio/Technology 8:209-215(1990). 
[3] Finley D., Varshavsky A. Trends Biochem. Sci. 10:343-347(1985). 
[4] Filippi M., Tribioli C., Toniolo D. Genomics 7:453-457(1990). 
[5] Olvera J., Wool I.G. J. Biol. Chem. 268:17967-17974(1993). 
[6] Kumar S., Yoshida Y., Noda M. Biochem. Biophys. Res. Commun. 195:393-399(1993). 
[7] Jones D., Candido E.P. J. Biol. Chem. 268:19545-19551(1993). 
[8] Melnick L., Sherman F. J. Mol. Biol. 233:372-388(1993). 

710. VHS domain 

Domain present in VPS-27, Hrs and STAM. Number of members: 27 

711. Vinculin family signatures 

Vinculin [1] is a eukaryotic protein that seems to be involved in the attachment of the 
actin-based microfilaments to the plasma membrane. Vinculin is located at the cytoplasmic side 
of focal contacts or adhesion plaques. In addition to actin, vinculin interacts with other 
structural proteins such as talin and alpha-actinins. Vinculin is a large protein of 116 Kd 
(about 1000 residues). Structurally the protein consists of an acidic N-terminal domain of 
about 90 Kd separated from a basic C-terminal domain of about 25 Kd by a proline-rich region 
of about 50 residues. The central part of the N-terminal domain consists of a variable number 
(3 in vertebrates, 2 in Caenorhabditis elegans) of repeats of a 110 amino acid domain. Catenins 
[2] are proteins that associate with the cytoplasmic domain of a variety of cadherins. The 
association of catenins to cadherins produces a complex which is linked to the actin filament 
network, and which seems to be of primary importance for the cell-adhesion properties of 
cadherins. Three different types of catenins seem to exist: alpha, beta, and gamma. 
Alpha-catenins are proteins of about 100 Kd which are evolutionarily related to vinculin. In 
terms of their structure, the most significant differences are the absence, in alpha-catenin, 
of the repeated domain and of the proline-rich segment. Two signature patterns for this family 
of proteins have been developed. The first pattern is located in the N-terminal section of both 
vinculin and alpha-catenins and is part, in vinculin, of a domain that seems to be involved 
in the interaction with talin. The second pattern is based on a conserved region in the 
N-terminal part of the repeated domain of vinculin.

Consensus pattern: [KR]-x-[LIVMF]-x(3)-[LIVMA]-x(2)-[LIVM]-x(6)-R-Q-Q-E-L 

Consensus pattern: [LIVM]-x-[QA]-A-x(2)-W-[IL]-x-[DN]-P 

[1] Otto J.J. Cell Motil. Cytoskeleton 16:1-6(1990). 
[2] Herrenknecht K., Ozawa M., Eckerskorn C., Lottspeich F., Lenter M., Kemler R. 
Proc. Natl. Acad. Sci. U.S.A. 88:9156-9160(1991). 

712. (Vitellogenin N) Lipoprotein amino terminal region 

This family contains regions from: Vitellogenin, Microsomal triglyceride transfer 
protein and apolipoprotein B-100. These proteins are all involved in lipid transport [1]. This 
family contains the LV1n chain from lipovitellin, which contains two structural domains. 
Number of members: 33 

[1] The structural basis of lipid interactions in lipovitellin, a soluble lipoprotein. 
Anderson TA, Levitt DG, Banaszak LJ; Structure 1998;6:895-909.

713. (vMSA) Major surface antigen from hepadnavirus 

714. ssDNA binding protein (Viral DNA bp) 

This protein is found in herpesviruses and is needed for 
replication. 

715. (Voltage CLC) Voltage gated chloride channels 

This family of ion channels contains 10 or 12 transmembrane helices. Each protein forms a 
single pore. It has been shown that some members of this family form homodimers. These 
proteins contain two CBS domains. 

[1] Schmidt-Rose T, Jentsch TJ; J Biol Chem 1997;272:20515-20521. 
[2] Zhang J, George AL Jr, Griggs RC, Fouad GT, Roberts J, Kwiecinski H, Connolly AM, 
Ptacek LJ; Neurology 1996;47:993-998.

716. von Willebrand factor type A domain (vwa) 

More von Willebrand factor type A domains? Sequence 
similarities with malaria thrombospondin-related 
anonymous protein, dihydropyridine-sensitive calcium 
channel and inter-alpha-trypsin inhibitor. 
Bork P, Rohde K; 
Biochem J 1991;279:908-911.

1. RUGGERI, Z.M. and WARE, J. 
von Willebrand factor. 
FASEB J. 7 308-316 (1993). 

2. COLOMBATTI, A., BONALDO, P. and DOLIANA, R. 

Type A modules: interacting domains found in several non-fibrillar 
collagens and in other extracellular matrix proteins. 
MATRIX 13 297-306 (1993). 

3. PERKINS, S.J., SMITH, K.F., WILLIAMS, S.C., HARIS, P.I., CHAPMAN, D. 
and SIM, R.B. 

The secondary structure of the von Willebrand factor type A domain in 
factor B of human complement by Fourier transform infrared spectroscopy. 
Its occurrence in collagen types VI, VII, XII and XIV, the integrins and 
other proteins by averaged structure predictions. 
J.MOL.BIOL. 238 104-119 (1994). 

4. BORK, P. and ROHDE, K. 

More von Willebrand factor type A domains? Sequence similarities with 
malaria thrombospondin -related anonymous protein, dihydropyridine- 
sensitive calcium channel and inter-alpha-trypsin inhibitor. 
BIOCHEM.J. 279 908-910 (1991). 

5. EDWARDS, Y.J.K. and PERKINS, S.J. 

The protein fold of the von Willebrand factor type A domain is predicted 
to be similar to the open twisted beta-sheet flanked by alpha-helices 
found in human ras-p21. 
FEBS LETT. 358 283-286 (1995). 



6. LEE, J.O., RIEU, P., ARNAOUT, M.A. and LIDDINGTON, R. 

Crystal structure of the A domain from the alpha subunit of integrin CR3 
(CD11b/CD18). 
CELL 80 631-638 (1995).

7. QU, A. and LEAHY, D.J. 

Crystal structure of the I-domain from the CD11a/CD18 (LFA-1, 
alpha L beta 2) integrin. 
PROC.NATL.ACAD.SCI.USA 92 10277-10281 (1995).

The von Willebrand factor is a large multimeric glycoprotein found in blood 
plasma. Mutant forms are involved in the aetiology of bleeding disorders 
[1]. In von Willebrand factor, the type A domain (vWF) is the prototype for 
a protein superfamily. The vWF domain is found in various plasma proteins: 
complement factors B, C2, CR3 and CR4; the integrins (I-domains); collagen 
types VI, VII, XII and XIV; and other extracellular proteins [2-4]. Proteins 
that incorporate vWF domains participate in numerous biological events 
(e.g., cell adhesion, migration, homing, pattern formation, and signal 
transduction), involving interaction with a large array of ligands [2]. 
Secondary structure prediction from 75 aligned vWF sequences has revealed 
a largely alternating sequence of alpha-helices and beta-strands [3]. Fold 
recognition algorithms were used to score sequence compatibility with a 
library of known structures: the vWF domain fold was predicted to be a 
doubly-wound, open, twisted beta-sheet flanked by alpha-helices [5]. 
3D structures have been determined for the I-domains of integrins CD11b 
(with bound magnesium) [6] and CD11a (with bound manganese) [7]. The domain 
adopts a classic alpha/beta Rossmann fold and contains an unusual metal 
ion coordination site at its surface. It has been suggested that this site 
represents a general metal ion-dependent adhesion site (MIDAS) for binding 
protein ligands [6]. The residues constituting the MIDAS motif in the CD11b 
and CD11a I-domains are completely conserved, but the manner in which the 
metal ion is coordinated differs slightly [7].

VWFADOMAIN is a 3-element fingerprint that provides a signature for the vWF 
domain superfamily. The fingerprint was derived from an initial alignment 
of 14 sequences. Motif 1 includes the first beta-strand and 3 conserved 
residues involved in metal ion coordination in I-domains (Asp and 2 serines 
in positions 8, 10 and 12, respectively); motif 2 spans strands beta-2 and 
beta-2'; and motif 3 encodes beta-strand 3 and a conserved Asp (in position 
7), which coordinates the metal ion [6,7]. Three iterations on OWL27.0 were 
required to reach convergence, at which point a true set comprising 56 
sequences was identified. Numerous partial matches were also found.
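
The distinction between the true set and the partial matches can be sketched as follows: a sequence that matches every motif of the fingerprint in order counts as a full match, while one that matches only some of them counts as a partial match. The actual VWFADOMAIN motifs are not reproduced in the text, so the three regular expressions below are hypothetical placeholders, and simple regular-expression matching is an assumption that stands in for whatever motif scoring the fingerprint database actually applies.

```python
import re


def classify_fingerprint_hit(sequence, motif_regexes):
    """Classify a sequence as a 'full', 'partial' or 'none' fingerprint match.

    Motifs are searched in order; each search starts after the previous hit
    so that matched motifs keep their N- to C-terminal order.
    """
    position = 0
    matched = 0
    for motif in motif_regexes:
        hit = re.compile(motif).search(sequence, position)
        if hit is None:
            continue  # motif absent; keep looking for the remaining motifs
        matched += 1
        position = hit.end()
    if matched == len(motif_regexes):
        return "full"
    if matched > 0:
        return "partial"
    return "none"


# Hypothetical 3-element fingerprint and toy sequences (illustration only).
toy_fingerprint = ["D.S.S", "GG.T", "L.D"]
full_hit = classify_fingerprint_hit("AADASGSKKGGATLLLADK", toy_fingerprint)
partial_hit = classify_fingerprint_hit("AADASGSKK", toy_fingerprint)
```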

717. (WD40) WD domain, G-beta repeat 

The ancient regulatory-protein family of WD-repeat proteins. 

Neer EJ, Schmidt CJ, Nambudripad R, Smith TF; 

Nature 1994;371:297-300. 

Beta-transducin (G-beta) is one of the three subunits (alpha, beta, and gamma) 
of the guanine nucleotide-binding proteins (G proteins) which act as 
intermediaries in the transduction of signals generated by transmembrane 
receptors [1]. The alpha subunit binds to and hydrolyzes GTP; the functions of 
the beta and gamma subunits are less clear but they seem to be required for 
the replacement of GDP by GTP as well as for membrane anchoring and 
receptor recognition. 

In higher eukaryotes G-beta exists as a small multigene family of highly 
conserved proteins of about 340 amino acid residues. Structurally G-beta 
consists of eight tandem repeats of about 40 residues, each containing a 
central Trp-Asp motif (this type of repeat is sometimes called a WD-40 
repeat). Such a repetitive segment has been shown [E1,2,3,4,5] to exist in a 
number of other proteins listed below: 

- Yeast STE4, a component of the pheromone response pathway. STE4 is a G-beta 
like protein that associates with GPA1 (G-alpha) and STE18 (G-gamma). 

- Yeast MSI1, a negative regulator of RAS-mediated cAMP synthesis. MSI1 is 
most probably also a G-beta protein. 

- Human and chicken protein 12.3. The function of this protein is not known, 
but on the basis of its similarity to G-beta proteins, it may also function 
in signal transduction. 

- Chlamydomonas reinhardtii gblp. This protein is most probably the homolog 
of vertebrate protein 12.3. 

- Human LIS1, a neuronal protein involved in type-1 lissencephaly [E2]. 

- Mammalian coatomer beta' subunit (beta'-COP), a component of a cytosolic 
protein complex that reversibly associates with Golgi membranes to form 
vesicles that mediate biosynthetic protein transport. 

- Yeast CDC4, essential for initiation of DNA replication and separation of 
the spindle pole bodies to form the poles of the mitotic spindle. 

- Yeast CDC20, a protein required for two microtubule-dependent processes: 
nuclear movements prior to anaphase and chromosome separation. 

- Yeast MAK11, essential for cell growth and for the replication of M1
double-stranded RNA.

- Yeast PRP4, a component of the U4/U6 small nuclear ribonucleoprotein with 
a probable role in mRNA splicing. 

- Yeast PWP1, a protein of unknown function.

- Yeast SKI8, a protein essential for controlling the propagation of
double-stranded RNA.

- Yeast SOF1, a protein required for ribosomal RNA processing which
associates with U3 small nucleolar RNA. 

- Yeast TUP1 (also known as AER2 or SFL2 or CYC9), a protein which has been
implicated in dTMP uptake, catabolite repression, mating sterility, and
many other phenotypes.

- Yeast YCR57c, an ORF of unknown function from chromosome III. 

- Yeast YCR72c, an ORF of unknown function from chromosome III. 

- Slime mold coronin, an actin-binding protein. 

- Slime mold AAC3, a developmentally regulated protein of unknown function.

- Drosophila protein Groucho (formerly known as E(spl); 'enhancer of split'),
a protein involved in neurogenesis and that seems to interact with the 
Notch and Delta proteins. 

- Drosophila TAF-II-80, a protein that is tightly associated with TFIID. 

The number of repeats in the above proteins varies between 5 (PRP4, TUP1, and
Groucho) and 8 (G-beta, STE4, MSI1, AAC3, CDC4, PWP1, etc.). In G-beta and
G-beta like proteins, the repeats span the entire length of the sequence, while
in other proteins, they make up the N-terminal, the central or the C-terminal 
section. 

A signature pattern can be developed from the central core of the domain 
(positions 9 to 23). 

-Consensus pattern: [LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x(2)-[DN]-
x(2)-[LIVMWSTAC]-x-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN]
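
As a minimal sketch of how such a signature can be applied, the pattern above can be translated into a regular expression and scanned against a sequence. The names `prosite_to_regex`, `find_signature` and `WD40_PATTERN` are assumptions introduced for illustration, and the peptide in the usage note is synthetic; it is not taken from any protein listed above.

```python
import re

# The WD-40 signature as written above.
WD40_PATTERN = (
    "[LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x(2)-[DN]-"
    "x(2)-[LIVMWSTAC]-x-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN]"
)

# One pattern element: a residue class, a single residue or x,
# optionally followed by a repeat count such as (2) or (13,14).
ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex."""
    parts = []
    for element in pattern.split("-"):
        m = ELEMENT.fullmatch(element.strip())
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = m.groups()
        regex = "." if residue == "x" else residue
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)


def find_signature(pattern: str, sequence: str):
    """Return (1-based start, matched segment) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]
```

For example, `find_signature(WD40_PATTERN, "MKLLSGSKDGTIKLWDLRR")` reports one hit starting at position 3. The translator only covers the pattern elements that occur in this section; other PROSITE syntax is deliberately rejected rather than guessed at.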

[ 1] Gilman A.G.
Annu. Rev. Biochem. 56:615-649(1987).
[ 2] Duronio R.J., Gordon J.I., Boguski M.S.
Proteins 13:41-56(1992).
[ 3] van der Voorn L., Ploegh H.L.
FEBS Lett. 307:131-134(1992).
[ 4] Neer E.J., Schmidt C.J., Nambudripad R., Smith T.F.
Nature 371:297-300(1994).
[ 5] Smith T.F., Gaiatzes C.G., Saxena K., Neer E.J.
Biochemistry In Press(1998).

718. WHEP-TRS domain containing proteins 

A conserved domain of 46 amino acids has been shown [1] to exist in a number 
of higher eukaryote aminoacyl-transfer RNA synthetases. This domain is present
one to six times in the following enzymes:

- Mammalian multifunctional aminoacyl-tRNA synthetase. The domain is present 
three times in a region that separates the N-terminal glutamyl-tRNA 
synthetase domain from the C-terminal prolyl-tRNA synthetase domain. 

- Drosophila multifunctional aminoacyl-tRNA synthetase. The domain is present 
six times in the intercatalytic region. 

- Mammalian tryptophanyl-tRNA synthetase. The domain is found at the N- 
terminal extremity. 

- Mammalian, insect, nematode and plant glycyl-tRNA synthetase. The domain is 
found at the N-terminal extremity [2]. 

- Mammalian histidyl-tRNA synthetase. The domain is found at the N-terminal 
extremity. 

This domain, which is called WHEP-TRS, could contain a central alpha-helical 
region and may play a role in the association of tRNA-synthetases into 
multienzyme complexes. 

A signature pattern based on the first 29 positions of the WHEP-TRS
domain has been developed.

-Consensus pattern: [QY]-G-[DNEA]-x-[LIV]-[KR]-x(2)-K-x(2)-[KRNG]-[AS]-x(4)- 
[LIV]-[DENK]-x(2)-[IV]-x(2)-L-x(3)-K 

[ 1] Cerini C., Kerjan P., Astier M., Gratecos D., Mirande M., Semeriva M.
EMBO J. 10:4267-4277(1991).
[ 2] Nada S., Chang P.K., Dignam J.D.
J. Biol. Chem. 268:7660-7667(1993).

719. (Worm family 8) Putative membrane protein 

Analysis of protein domain families in Caenorhabditis elegans. 

Sonnhammer EL, Durbin R; 




Genomics 1997;46:200-216. 

This family, called family 8 in [1], may be a transmembrane protein.
The specific function of this protein is unknown.

720. Xylose isomerase 

Xylose isomerase (EC 5.3.1.5) [1] is an enzyme found in microorganisms which 
catalyzes the interconversion of D-xylose to D-xylulose. It can also isomerize 
D-ribose to D-ribulose and D-glucose to D-fructose. Xylose isomerase seems to 
require magnesium for its activity, while cobalt is necessary to stabilize the 
tetrameric structure of the enzyme. A number of residues are conserved in all 
known xylose isomerases. 

Xylose isomerase also exists in plants [2] where it is homodimeric and is 
manganese-dependent. 

Two signature patterns for xylose isomerase have been developed. The first one is
derived from a stretch of five conserved amino acids that includes a glutamic
acid residue known to be one of the four residues involved in the binding of
the magnesium ion [3]; this pattern also includes a lysine residue which is
involved in the catalytic activity. The second pattern is derived from a
conserved region in the N-terminal section of the enzyme that includes a
histidine residue which has been shown [4] to be involved in the catalytic
mechanism of the enzyme.

-Consensus pattern: [LI]-E-P-K-P-x(2)-P 

[E is a magnesium ligand] 

[K is an active site residue] 
-Consensus pattern: [FL]-H-D-x-D-[LIV]-x-[PD]-x-[GDE] 

[H is an active site residue] 
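
The annotations attached to the two signatures can be carried into a scan, so that each hit also reports where the annotated residues fall. The sketch below is a hand translation of the two patterns into regular expressions with named groups; the names `MG_SITE`, `HIS_SITE` and `annotated_positions` are assumptions for illustration, and the peptides in the usage note are synthetic.

```python
import re

# Hand translation of the two xylose isomerase signatures above.
# Named groups mark the residues singled out by the annotations.
MG_SITE = re.compile(r"[LI](?P<mg_ligand>E)P(?P<active_site>K)P.{2}P")
HIS_SITE = re.compile(r"[FL](?P<active_site>H)D.D[LIV].[PD].[GDE]")


def annotated_positions(regex, sequence):
    """Return, for each hit, the 1-based positions of the named residues."""
    hits = []
    for m in regex.finditer(sequence):
        hits.append({name: m.start(name) + 1 for name in regex.groupindex})
    return hits
```

With the synthetic peptide `"AALEPKPNEPGG"`, `annotated_positions(MG_SITE, ...)` places the magnesium ligand at position 4 and the active-site lysine at position 6. A match only shows that a residue sits at the annotated position of the pattern; it does not by itself establish function.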

[ 1] Dauter Z., Dauter M., Hemker J., Witzel H., Wilson K.S.
FEBS Lett. 247:1-8(1989).




[ 2] Kristo P.A., Saarelainen R., Fagerstrom R., Aho S., Korhola M.
Eur. J. Biochem. 237:240-246(1996).
[ 3] Henrick K., Collyer C.A., Blow D.M.
J. Mol. Biol. 208:129-157(1989).
[ 4] Vangrysperre W., Ampe C., Kersters-Hilderson H., Tempst P.
Biochem. J. 263:195-199(1989).

721. XPG protein signatures

Xeroderma pigmentosum (XP) [1] is a human autosomal recessive disease,
characterized by a high incidence of sunlight-induced skin cancer. Skin cells
of people with this condition are hypersensitive to ultraviolet light, due to
defects in the incision step of DNA excision repair. There are a minimum of
seven genetic complementation groups involved in this pathway: XP-A to XP-G.
The defect in XP-G can be corrected by a 133 Kd nuclear protein called XPG (or
XPGC) [2]. XPG belongs to a family of proteins [2,3,4,5,6] that is composed of
two main subsets:

- Subset 1, to which belong XPG, RAD2 from budding yeast and rad13 from
fission yeast. RAD2 and XPG are single-stranded DNA endonucleases [7,8]. XPG
makes the 3' incision in human DNA nucleotide excision repair [9].

- Subset 2, to which belong mouse and human FEN-1, rad2 from fission yeast,
and RAD27 from budding yeast. FEN-1 is a structure-specific endonuclease.

In addition to the proteins listed in the above groups, this family also
includes:

- Fission yeast exo1, a 5'->3' double-stranded DNA exonuclease that could act
in a pathway that corrects mismatched base pairs.

- Yeast EXO1 (DHS1), a protein with probably the same function as exo1.

- Yeast DIN7.

Sequence alignment of this family of proteins reveals that similarities are
largely confined to two regions. The first is located at the N-terminal
extremity (N-region) and corresponds to the first 95 to 105 amino acids. The
second region is internal (I-region) and found towards the C-terminus; it
spans about 140 residues and contains a highly conserved core of 27 amino
acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible
that the conserved acidic residues are involved in the catalytic mechanism of
DNA excision repair in XPG. The amino acids linking the N- and I-regions are
not conserved; indeed, they are largely absent from proteins belonging to the
second subset. Two signature patterns have been developed for these proteins.
The first corresponds to the central part of the N-region, the second to part
of the I-region and includes the putative catalytic core pentapeptide.

-Consensus pattern: [VI]-[KRE]-P-x-[FYIL]-V-F-D-G-x(2)-[PIL]-x-[LVC]-K
-Consensus pattern: [GS]-[LIVM]-[PER]-[FYS]-[LIVM]-x-A-P-x-E-A-[DE]-[PAS]-[QS]-
[CLM]

[ 1] Tanaka K., Wood R.D.
Trends Biochem. Sci. 19:83-86(1994).
[ 2] Scherly D., Nouspikel T., Corlet J., Ucla C., Bairoch A., Clarkson S.G.
Nature 363:182-185(1993).
[ 3] Carr A.M., Sheldrick K.S., Murray J.M., Al-Harithy R., Watts F.Z.,
Lehmann A.R.
Nucleic Acids Res. 21:1345-1349(1993).
[ 4] Murray J.M., Tavassoli M., Al-Harithy R., Sheldrick K.S., Lehmann A.R.,
Carr A.M., Watts F.Z.
Mol. Cell. Biol. 14:4878-4888(1994).
[ 5] Harrington J.J., Lieber M.R.
Genes Dev. 8:1344-1355(1994).
[ 6] Szankasi P., Smith G.R.
Science 267:1166-1169(1995).
[ 7] Habraken Y., Sung P., Prakash L., Prakash S.
Nature 366:365-368(1993).
[ 8] O'Donovan A., Scherly D., Clarkson S.G., Wood R.D.
J. Biol. Chem. 269:15965-15968(1994).
[ 9] O'Donovan A., Davies A.A., Moggs J.G., West S.C., Wood R.D.
Nature 371:432-435(1994).

722. Xanthine/uracil permeases family 

The following transport proteins which are involved in the uptake of xanthine
or uracil are evolutionarily related [1]:

- Uric acid-xanthine permease (gene uapA) from Aspergillus nidulans.

- Purine permease (gene uapC) from Aspergillus nidulans. 

- Xanthine permease from Bacillus subtilis (gene pbuX). 

- Uracil permease from Escherichia coli (gene uraA) [2] and Bacillus (gene 
pyrP). 

- Hypothetical protein ycdG from Escherichia coli. 

- Hypothetical protein ygfO from Escherichia coli. 

- Hypothetical protein ygfU from Escherichia coli. 

- Hypothetical protein yicE from Escherichia coli. 

- Hypothetical protein yunJ from Bacillus subtilis. 

- Hypothetical protein yunK from Bacillus subtilis.

They are proteins of 430 to 595 residues that seem to contain 12
transmembrane domains.

The best conserved region, which corresponds to what seems to be the tenth
transmembrane domain, has been selected as a signature pattern.

-Consensus pattern: [LIVM]-P-x-[PASIF]-V-[LIVM]-G-G-x(4)-[LIVM]-[FY]-[GSA]- 
[LIVM]-x(3)-G 

[ 1] Diallinas G., Gorfinkiel L., Arst G., Cecchetto G., Scazzocchio C.
J. Biol. Chem. 270:8610-8622(1995).
[ 2] Andersen P.S., Frees D., Fast R., Mygind B.
J. Bacteriol. 177:2008-2013(1995).

723. Hypothetical yabO/yceC/sfhB family 

The following proteins, which seem to belong to a family of pseudouridine
synthases (EC 4.2.1.70) [1], have been shown to share regions of similarity:

- Escherichia coli and Haemophilus influenzae ribosomal large subunit
pseudouridine synthase A (gene rluA). It is responsible for synthesis of
pseudouridine from uracil-746 in 23S rRNA.

- Escherichia coli and Haemophilus influenzae ribosomal large subunit 
pseudouridine synthase C (gene rluC). It is responsible for synthesis of 
pseudouridine from uracil at positions 955, 2504 and 2580 in 23S rRNA. 

- Escherichia coli large subunit pseudouridine synthase D (gene rluD) and its
homologs in other bacteria.

- Yeast DRAP deaminase (gene RIB2). 

- Escherichia coli hypothetical protein yqcB and HI1435, the corresponding 
Haemophilus influenzae protein. 

- Haemophilus influenzae hypothetical protein HI0042. 

- Aquifex aeolicus hypothetical protein AQ_1758. 

- Bacillus subtilis hypothetical protein yhcT. 

- Bacillus subtilis hypothetical protein yjbO. 

- Bacillus subtilis hypothetical protein ylyB. 




- Helicobacter pylori hypothetical protein HP0347. 

- Helicobacter pylori hypothetical protein HP0745. 

- Helicobacter pylori hypothetical protein HP0956. 

- Mycoplasma genitalium hypothetical protein MG209. 

- Mycoplasma genitalium hypothetical protein MG370. 

- Synechocystis strain PCC 6803 hypothetical protein slr1592.

- Synechocystis strain PCC 6803 hypothetical protein slr1629.

- Yeast hypothetical protein YDL036c. 

- Yeast hypothetical protein YGR169c. 

- Fission yeast hypothetical protein SpAC18B11.02c.

- Caenorhabditis elegans hypothetical protein K07E8.7. 

These are proteins of 21 to 50 Kd which contain a number of conserved
regions in their central section. They can be picked up in the database by the 
following highly conserved pattern. 

-Consensus pattern: [LIVCA]-[NHYT]-R-[LI]-D-x(2)-T-[STA]-G-[LIVAGC]- 
[LIVMF](2)-[LIVMFGC]-[SGTACV] 

[ 1] Conrad J., Sun D., Englund N., Ofengand J. 
J. Biol. Chem. 273:18562-18566(1998). 

In addition, the following bacterial proteins, which seem to belong to a family
of pseudouridine synthases (EC 4.2.1.70) [1], have also been shown to share
regions of similarity:

- Escherichia coli and Haemophilus influenzae 16S pseudouridylate 516 
synthase (EC 4.2.1.70) (gene: rsuA). This enzyme is responsible for the 
formation of pseudouridine from uracil-516 in 16S ribosomal RNA. 

- Escherichia coli hypothetical protein yciL and HI1199, the corresponding
Haemophilus influenzae protein. 

- Escherichia coli hypothetical protein yjbC. 

- Escherichia coli hypothetical protein ymfC and HI0694, the corresponding
Haemophilus influenzae protein.

- Aquifex aeolicus hypothetical protein AQ_554. 

- Aquifex aeolicus hypothetical protein AQ_1464. 

- Bacillus subtilis hypothetical protein ypuL. 

- Bacillus subtilis hypothetical protein ytzF. 

- Borrelia burgdorferi hypothetical protein BB0129. 

- Helicobacter pylori hypothetical protein HP1459. 

- Synechocystis strain PCC 6803 hypothetical protein slr0361. 

- Synechocystis strain PCC 6803 hypothetical protein slr0612. 

These are proteins of 25 to 40 Kd which contain a number of conserved
regions in their central section. They can be picked up in the database by the 
following highly conserved pattern. 

-Consensus pattern: G-R-L-D-x(2)-[STA]-x-G-[LIVFA]-[LIVMF](3)-[ST]-[DNST] 

[ 1] Wrzesinski J., Bakin A., Nurse K., Lane B.G., Ofengand J. 
Biochemistry 34:8904-8913(1995). 

724. Zinc finger present in dystrophin, CBP/p300 
ZZ in dystrophin binds calmodulin 

Putative zinc finger; binding not yet shown. 

725. Zinc carboxypeptidase

There are a number of different types of zinc-dependent carboxypeptidases (EC 
3.4.17.-) [1,2]. All these enzymes seem to be structurally and functionally 
related. The enzymes that belong to this family are listed below. 

- Carboxypeptidase A1 (EC 3.4.17.1), a pancreatic digestive enzyme that can
remove all C-terminal amino acids with the exception of Arg, Lys and Pro.

- Carboxypeptidase A2 (EC 3.4.17.15), a pancreatic digestive enzyme with a
specificity similar to that of carboxypeptidase A1, but with a preference
for bulkier C-terminal residues.

- Carboxypeptidase B (EC 3.4.17.2), also a pancreatic digestive enzyme, but 
that preferentially removes C-terminal Arg and Lys. 

- Carboxypeptidase N (EC 3.4.17.3) (also known as arginine carboxypeptidase), 
a plasma enzyme which protects the body from potent vasoactive and 
inflammatory peptides containing C-terminal Arg or Lys (such as kinins or 
anaphylatoxins) which are released into the circulation. 

- Carboxypeptidase H (EC 3.4.17.10) (also known as enkephalin convertase or 
carboxypeptidase E), an enzyme located in secretory granules of pancreatic 
islets, adrenal gland, pituitary and brain. This enzyme removes residual C- 
terminal Arg or Lys remaining after initial endoprotease cleavage during 
prohormone processing. 

- Carboxypeptidase M (EC 3.4.17.12), a membrane bound Arg and Lys specific
enzyme. It is ideally situated to act on peptide hormones at local tissue sites
where it could control their activity before or after interaction with
specific plasma membrane receptors.

- Mast cell carboxypeptidase (EC 3.4.17.1), an enzyme with a specificity
similar to that of carboxypeptidase A, but found in the secretory granules of
mast cells.

- Streptomyces griseus carboxypeptidase (Cpase SG) (EC 3.4.17.-) [3], which 
combines the specificities of mammalian carboxypeptidases A and B. 

- Thermoactinomyces vulgaris carboxypeptidase T (EC 3.4.17.18) (CPT) [4], 
which also combines the specificities of carboxypeptidases A and B. 

- AEBP1 [5], a transcriptional repressor active in preadipocytes. AEBP1 seems
to regulate transcription by cleavage of other transcriptional proteins. 

- Yeast hypothetical protein YHR132c. 

All of these enzymes bind an atom of zinc. Three conserved residues are
implicated in the binding of the zinc atom: two histidines and a glutamic acid.
Two signature patterns which contain these three zinc-ligands have been derived.

-Consensus pattern: [PK]-x-[LIVMFY]-x-[LIVMFY]-x(4)-H-[STAG]-x-E-x-[LIVM]-
[STAG]-x(6)-[LIVMFYTA]
[H and E are zinc ligands]
-Consensus pattern: H-[STAG]-x(3)-[LIVME]-x(2)-[LIVMFYW]-P-[FYW]
[H is a zinc ligand]

[ 1] Tan F., Chan S.J., Steiner D.F., Schilling J.W., Skidgel R.A.
J. Biol. Chem. 264:13165-13170(1989).
[ 2] Reynolds D.S., Stevens R.L., Gurley D.S., Lane W.S., Austen K.F.,
Serafin W.E.
J. Biol. Chem. 264:20094-20099(1989).
[ 3] Narahashi Y.
J. Biochem. 107:879-886(1990).
[ 4] Teplyakov A., Polyakov K., Obmolova G., Strokopytov B., Kuranova I.,
Osterman A.L., Grishin N.V., Smulevitch S.V., Zagnitko O.P.,
Galperina O.V., Matz M.V., Stepanov V.M.
Eur. J. Biochem. 208:281-288(1992).
[ 5] He G.-P., Muise A., Li A.W., Ro H.-S.
Nature 378:92-96(1995).
[ 6] Hourdou M.-L., Guinand M., Vacheron M.J., Michel G., Denoroy L.,
Duez C.M., Englebert S., Joris B., Weber G., Ghuysen J.-M.
Biochem. J. 292:563-570(1993).
[ 7] Rawlings N.D., Barrett A.J.
Meth. Enzymol. 248:183-228(1995).

726. Zinc finger, C2H2 type 

The C2H2 zinc finger is the classical zinc finger domain. 
The two conserved cysteines and histidines co-ordinate a 
zinc ion. The following pattern describes the zinc finger. 
#-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]
Where X can be any amino acid, and numbers in brackets 
indicate the number of residues. The positions marked # are 
those that are important for the stable fold of the zinc
finger. The final position can be either His or Cys.
The C2H2 zinc finger is composed of two short beta strands 
followed by an alpha helix. The amino terminal part of the 
helix binds the major groove in DNA binding zinc fingers. 
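
The zinc finger description above uses a different notation from the consensus patterns elsewhere in this section (`X3`, `X(1-5)`, `#`, `[H/C]`). As a rough sketch, it can be converted to a regular expression as follows. The function name `pfam_style_to_regex` is an assumption, the range is read as `X(1-5)`, and treating `#` as "any residue" is a simplification, since the text only says those positions matter for the fold.

```python
import re

C2H2_DESCRIPTION = "#-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C]"

# Tokens: X(a-b), X or Xn, a bracketed alternative such as [H/C],
# a single residue letter, or the fold-position marker #.
TOKEN = re.compile(r"X\(\d+-\d+\)|X\d*|\[[A-Z/]+\]|[A-Z#]")


def pfam_style_to_regex(pattern: str) -> str:
    """Convert the descriptive notation above into a Python regex."""
    out = []
    for tok in TOKEN.findall(pattern):
        ranged = re.fullmatch(r"X\((\d+)-(\d+)\)", tok)
        if ranged:
            out.append(".{%s,%s}" % ranged.groups())
        elif tok.startswith("X"):
            count = tok[1:]
            out.append(".{" + count + "}" if count else ".")
        elif tok == "#":
            out.append(".")  # assumption: no residue restriction is encoded
        elif tok.startswith("["):
            out.append("[" + tok[1:-1].replace("/", "") + "]")
        else:
            out.append(tok)
    return "".join(out)
```

An illustrative 25-residue peptide such as `"YACPVESCDRRFSRSDELTRHIRIH"` (chosen for this example, not quoted from the text) is matched in full by `re.fullmatch(pfam_style_to_regex(C2H2_DESCRIPTION), ...)`, whereas the same peptide with both cysteines replaced by alanine is not.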

'Zinc finger' domains [1-5] are nucleic acid-binding protein structures first 
identified in the Xenopus transcription factor TFIIIA. These domains have 
since been found in numerous nucleic acid-binding proteins. A zinc finger 
domain is composed of 25 to 30 amino-acid residues. There are two cysteine or 
histidine residues at both extremities of the domain, which are involved in 
the tetrahedral coordination of a zinc atom. It has been proposed that such a 
domain interacts with about five nucleotides. A schematic representation of a 
zinc finger domain is shown below: 

            X  X
          X      X
         X        X
        X          X
        X          X
        X          X
          C      H
        X   \  /   X
        X    Zn    X
        X   /  \   X
          C      H
XXXXX                XXXXX

Many classes of zinc fingers are characterized according to the number and 
positions of the histidine and cysteine residues involved in the zinc atom 
coordination. In the first class to be characterized, called C2H2, the first 
pair of zinc coordinating residues are cysteines, while the second pair are 
histidines. A number of experimental reports have demonstrated the zinc-
dependent DNA or RNA binding property of some members of this class.

Some of the proteins known to include C2H2-type zinc fingers are listed below.
The number of zinc finger regions found in each of these proteins is indicated
between brackets; a '+' symbol indicates that only partial sequence
data is available and that additional finger domains may be present.

- Saccharomyces cerevisiae: ACE2 (3), ADR1 (2), AZF1 (4), FZF1 (5), MIG1 (2),
MSN2 (2), MSN4 (2), RGM1 (2), RIM1 (3), RME1 (3), SFP1 (2), SSL1 (1),
STP1 (3), SWI5 (3), VAC1 (1) and ZMS1 (2).

- Emericella nidulans: brlA (2), creA (2).

- Drosophila: AEF-1 (4), Cf2 (7), ci-D (5), Disconnected (2), Escargot (5),
Glass (5), Hunchback (6), Kruppel (5), Kruppel-H (4+), Odd-skipped (4),
Odd-paired (4), Pep (3), Snail (5), Spalt-major (7), Serendipity locus beta
(6), delta (7), h-1 (8), Suppressor of hairy wing su(Hw) (12), Suppressor
of variegation suvar(3)7 (5), Teashirt (3) and Tramtrack (2).

- Xenopus: transcription factor TFIIIA (9), p43 from RNP particle (9), Xfin
(37 !!), Xsna (5), gastrula XlcGF5.1 to XlcGF71.1 (from 4+ to 11+), Oocyte
XlcOF2 to XlcOF22 (from 7 to 12).

- Mammalian: basonuclin (6), BCL-6/LAZ-3 (6), erythroid krueppel-like
transcription factor (3), transcription factors Sp1 (3), Sp2 (3), Sp3 (3)
and Sp4 (3), transcriptional repressor YY1 (4), Wilms' tumor protein (4),
EGR1/Krox24 (3), EGR2/Krox20 (3), EGR3/Pilot (3), EGR4/AT133 (4), Evi-1
(10), GLI1 (5), GLI2 (4+), GLI3 (3+), HIV-EP1/ZNF40 (4), HIV-EP2 (2), KR1
(9+), KR2 (9), KR3 (15+), KR4 (14+), KR5 (11+), HR12 (6+), REX-1 (4), ZfX
(13), ZfY (13), Zfp-35 (18), ZNF7 (15), ZNF8 (7), ZNF35 (10), ZNF42/MZF-1
(13), ZNF43 (22), ZNF46/Kup (2), ZNF76 (7), ZNF91 (36), ZNF133 (3).

In addition to the conserved zinc ligand residues it has been shown [6] that a
number of other positions are also important for the structural integrity of
the C2H2 zinc fingers. The best conserved position is found four residues
after the second cysteine; it is generally an aromatic or aliphatic residue.

-Consensus pattern: C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
[The two C's and two H's are zinc ligands]

[ 1] Klug A., Rhodes D.
Trends Biochem. Sci. 12:464-469(1987).
[ 2] Evans R.M., Hollenberg S.M.
Cell 52:1-3(1988).
[ 3] Payre F., Vincent A.
FEBS Lett. 234:245-250(1988).
[ 4] Miller J., McLachlan A.D., Klug A.
EMBO J. 4:1609-1614(1985).
[ 5] Berg J.M.
Proc. Natl. Acad. Sci. U.S.A. 85:99-102(1988).
[ 6] Rosenfeld R., Margalit H.
J. Biomol. Struct. Dyn. 11:557-570(1993).

727. Zinc finger, C3HC4 type (RING finger) 

A number of eukaryotic and viral proteins contain a conserved cysteine-rich 
domain of 40 to 60 residues (called C3HC4 zinc-finger or 'RING' finger) [1] 
that binds two atoms of zinc, and is probably involved in mediating protein- 
protein interactions. The 3D structure of the zinc ligation system is unique 
to the RING domain and is referred to as the "cross-brace" motif. The spacing
of the cysteines in such a domain is C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to
3)-C-x(2)-C-x(4 to 48)-C-x(2)-C.
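
The spacing given above fixes how short or long the stretch between the first and last zinc ligand can be. A small sketch of that arithmetic follows, together with a regular expression built from the same spacers. The names `LIGANDS`, `SPACERS`, `span_bounds` and `spacing_regex` are assumptions for illustration, and the range "x(1 to 3)" is read as one to three residues.

```python
import re

# The eight zinc ligands in order, and the spacer lengths between them
# as (min, max) residue counts, read from the spacing given above.
LIGANDS = "CCCHCCCC"
SPACERS = [(2, 2), (9, 39), (1, 3), (2, 3), (2, 2), (4, 48), (2, 2)]


def span_bounds(ligands, spacers):
    """Shortest and longest ligand-to-ligand span allowed by the spacers."""
    shortest = len(ligands) + sum(lo for lo, _ in spacers)
    longest = len(ligands) + sum(hi for _, hi in spacers)
    return shortest, longest


def spacing_regex(ligands, spacers):
    """Build a regex that enforces the ligand order and spacer lengths."""
    parts = [ligands[0]]
    for residue, (lo, hi) in zip(ligands[1:], spacers):
        parts.append(".{%d}" % lo if lo == hi else ".{%d,%d}" % (lo, hi))
        parts.append(residue)
    return "".join(parts)
```

Under these assumptions, `span_bounds(LIGANDS, SPACERS)` gives 30 and 107 residues. These are only the limits the spacers permit from the first to the last ligand, not a statement about the size of the domain in any particular protein.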

Proteins currently known to include the C3HC4 domain are listed below 
(references are only provided for recently determined sequences). 

- Mammalian V(D)J recombination activating protein (gene RAG1). RAG1
activates the rearrangement of immunoglobulin and T-cell receptor genes. 

- Mouse rpt-1. Rpt-1 is a trans-acting factor that regulates gene expression 
directed by the promoter region of the interleukin-2 receptor alpha chain 
or the LTR promoter region of HIV-1.

- Human rfp. Rfp is a developmentally regulated protein that may function in
male germ cell development. Recombination of the N-terminal section of rfp
with a protein tyrosine kinase produces the ret transforming protein.

- Human 52 Kd Ro/SS-A protein. A protein of unknown function from the Ro/SS-A
ribonucleoprotein complex. Sera from patients with systemic lupus
erythematosus or primary Sjogren's syndrome often contain antibodies that
react with the Ro proteins.

- Human histocompatibility locus protein RING1.

- Human PML, a probable transcription factor. Chromosomal translocation of
PML with retinoic acid receptor alpha creates a fusion protein which is the
cause of acute promyelocytic leukemia (APL).

- Mammalian breast cancer type 1 susceptibility protein (BRCA1) [E1].

- Mammalian cbl proto-oncogene. 

- Mammalian bmi-1 proto-oncogene. 

- Vertebrate CDK-activating kinase (CAK) assembly factor MAT1, a protein that
stabilizes the complex between the CDK7 kinase and cyclin H (MAT1 stands
for 'Menage A Trois').

- Mammalian mel-18 protein. Mel-18, which is expressed in a variety of tumor
cells, is a transcriptional repressor that recognizes and binds a specific
DNA sequence.

- Mammalian peroxisome assembly factor-1 (PAF-1) (PMP35), which is
involved in the biogenesis of peroxisomes. In humans, defects in PAF-1 are
responsible for a form of Zellweger syndrome, an autosomal recessive
disorder associated with peroxisomal deficiencies.

- Human MAT1 protein, which interacts with the CDK7-cyclin H complex.

- Human RING1 protein.

- Xenopus XNF7 protein, a probable transcription factor.

- Trypanosoma protein ESAG-8 (T-LR), which may be involved in the
post-transcriptional regulation of genes in VSG expression sites or may
interact with adenylate cyclase to regulate its activity.

- Drosophila proteins Posterior Sex Combs (Psc) and Suppressor two of zeste 
(Su(z)2). The two proteins belong to the Polycomb group of genes needed to 
maintain the segment-specific repression of homeotic selector genes.

- Drosophila protein male-specific msl-2, a DNA-binding protein which is
involved in X chromosome dosage compensation (the elevation of 
transcription of the male single X chromosome). 

- Arabidopsis thaliana protein COP1 which is involved in the regulation of
photomorphogenesis. 

- Fungal DNA repair proteins RAD5, RAD16, RAD18 and rad8.

- Herpesviruses trans-acting transcriptional protein ICP0/IE110. This protein,
which has been characterized in many different herpesviruses, is a
trans-activator and/or -repressor of the expression of many viral and cellular
promoters.

- Baculoviruses protein CG30. 

- Baculoviruses major immediate early protein (PE-38). 

- Baculoviruses immediate-early regulatory protein IE-N/IE-2. 

- Caenorhabditis elegans hypothetical proteins F54G8.4, R05D3.4 and T02C1.1. 

- Yeast hypothetical proteins YER116c and YKR017c. 

The central region of the domain was selected as a signature pattern 
for the C3HC4 finger. 

-Consensus pattern: C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA] 

[ 1] Borden K.L.B., Freemont P.S. 
Curr. Opin. Struct. Biol. 6:395-401(1996). 

728. Zinc finger C-x8-C-x5-C-x3-H type (and similar). 

729. Zinc finger, CCHC class 

A family of CCHC zinc fingers, mostly from retroviral gag 
proteins (nucleocapsid). Prototype structure is from HIV. 
Also contains members involved in eukaryotic gene 
regulation, such as C. elegans GLH-1. 




Structure is an 18-residue zinc finger; no examples of indels 
in the alignment. 

730. Zn-finger in Ran binding protein and others. 

731. AN1-like Zinc finger

Zinc finger at the C-terminus of An1 Swiss:091889, a ubiquitin-like protein in Xenopus
laevis. The following pattern describes the zinc finger:
C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C
where X can be any amino acid, and numbers in brackets indicate the
number of residues.

[1] Linnen JM, Bailey CP, Weeks DL; Gene 1993;128:181-188. 

732. 14-3-3 proteins 

Structure of a 14-3-3 protein and implications for coordination of multiple 
signalling pathways. 

Xiao B, Smerdon SJ, Jones DH, Dodson GG, Soneji Y, Aitken A, Gamblin SJ; 
Nature 1995;376:188-191. 

Crystal structure of the zeta isoform of the 14-3-3 protein. 

Liu D, Bienkowska J, Petosa C, Collier RJ, Fu H, Liddington R; 

Nature 1995;376:191-194. 

Interaction of 14-3-3 with signaling proteins is mediated by the
recognition of phosphoserine.

Muslin AJ, Tanner JW, Allen PM, Shaw AS; 

Cell 1996;84:889-897. 

The 14-3-3 protein binds its target proteins with a common site
located towards the C-terminus.

Ichimura T, Ito M, Itagaki C, Takahashi M, Horigome T, Omata S, Ohno S,
Isobe T

FEBS Lett 1997;413:273-276.

Molecular evolution of the 14-3-3 protein family. 

Wang W, Shakes DC 

J Mol Evol 1996;43:384-398. 

Function of 14-3-3 proteins. 

Jin DY, Lyu MS, Kozak CA, Jeang KT 

Nature 1996;382:308-308. 

The 14-3-3 proteins [1,2,3] are a family of closely related acidic homodimeric 
proteins of about 30 Kd which were first identified as being very abundant in 
mammalian brain tissues and located preferentially in neurons. The 14-3-3 
proteins seem to have multiple biological activities and play a key role in 
signal transduction pathways and the cell cycle. They interact with kinases
such as PKC or Raf-1; they seem to also function as protein-kinase dependent 
activators of tyrosine and tryptophan hydroxylases and in plants they are 
associated with a complex that binds to the G-box promoter elements. 

The 14-3-3 proteins are found ubiquitously in all eukaryotic species
studied and have been sequenced in fungi (yeast BMH1 and BMH2, fission yeast
rad24 and rad25), plants, Drosophila, and vertebrates. The sequences of the 
14-3-3 proteins are extremely well conserved. Two highly conserved regions have 
been selected as signature patterns: the first is a peptide of 11 residues 
located in the N-terminal section; the second, a 20 amino acid region located 
in the C-terminal section. 

-Consensus pattern: R-N-L-[LIV]-S-[VG]-[GA]-Y-[KN]-N-[IVA] 
-Consensus pattern: Y-K-[DE]-S-T-L-I-[IM]-Q-L-[LF]-[RHC]-D-N-[LF]-T-[LS]-W- 
[TAN]-[SAD] 
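
Because the two signatures lie in different parts of the protein, a simple screen can require both of them, in the stated order. The sketch below hand-translates the two patterns into regular expressions; the names `SIG_N`, `SIG_C` and `has_both_signatures` are assumptions for illustration, and the test sequences in the usage note are synthetic.

```python
import re

# Hand translation of the two 14-3-3 signatures above.
SIG_N = re.compile(r"RNL[LIV]S[VG][GA]Y[KN]N[IVA]")
SIG_C = re.compile(r"YK[DE]STLI[IM]QL[LF][RHC]DN[LF]T[LS]W[TAN][SAD]")


def has_both_signatures(sequence: str) -> bool:
    """True if the N-terminal signature occurs before the C-terminal one."""
    n_hit = SIG_N.search(sequence)
    c_hit = SIG_C.search(sequence)
    return bool(n_hit and c_hit and n_hit.end() <= c_hit.start())
```

A synthetic sequence built as `"M" + "RNLLSVAYKNV" + "G" * 50 + "YKDSTLIMQLLRDNLTLWTS"` passes the check, while a sequence carrying only one of the two segments, or both in reverse order, does not. Requiring both hits is a design choice made for this sketch; the text itself does not say how the two signatures should be combined.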



[ 1] Aitken A.
Trends Biochem. Sci. 20:95-97(1995).
[ 2] Morrison D.
Science 266:56-57(1994).
[ 3] Xiao B., Smerdon S.J., Jones D.H., Dodson G.G., Soneji Y., Aitken A.,
Gamblin S.J.
Nature 376:188-191(1995).

733. D-isomer specific 2-hydroxyacid dehydrogenases (2 Hacid DH) 

This Pfam covers the Formate dehydrogenase, D-glycerate dehydrogenase and D-lactate 
dehydrogenase families in SCOP. A number of NAD-dependent 2-hydroxyacid dehydrogenases which 
seem to be specific for the D-isomer of their substrate have been shown [1,2,3,4] to be functionally and 
structurally related. These enzymes are listed below. 

- D-lactate dehydrogenase (EC 1.1.1.28), a bacterial enzyme which catalyzes the
oxidation of D-lactate to pyruvate.

- D-glycerate dehydrogenase (EC 1.1.1.29) (NADH-dependent hydroxy-pyruvate 
reductase), a plant leaf peroxisomal enzyme that catalyzes the reduction of 
hydroxypyruvate to glycerate. This reaction is part of the glycolate pathway of
photorespiration.

- D-glycerate dehydrogenase from the bacteria Hyphomicrobium methylovorum
and Methylobacterium extorquens.

- 3-phosphoglycerate dehydrogenase (EC 1.1.1.95), a bacterial enzyme that 
catalyzes the oxidation of D-3-phosphoglycerate to 3-phosphohydroxypyruvate. 
This reaction is the first committed step in the 'phosphorylated' pathway of serine 
biosynthesis. 

- Erythronate-4-phosphate dehydrogenase (EC 1.1.1.-) (gene pdxB), a bacterial
enzyme involved in the biosynthesis of pyridoxine (vitamin B6). 

- D-2-hydroxyisocaproate dehydrogenase (EC 1.1.1.-) (D-hicDH), a bacterial 
enzyme that catalyzes the reversible and stereospecific interconversion between 2- 
ketocarboxylic acids and D-2-hydroxy-carboxylic acids. 

- Formate dehydrogenase (EC 1.2.1.2) (FDH) from the bacterium Pseudomonas sp.
101 and various fungi [5].

- Vancomycin resistance protein vanH from Enterococcus faecium; this protein is a
D-specific alpha-keto acid dehydrogenase involved in the formation of a
peptidoglycan which does not terminate by D-alanine thus preventing
vancomycin binding.

- Escherichia coli hypothetical protein ycdW.

- Escherichia coli hypothetical protein yiaE.

- Haemophilus influenzae hypothetical protein HI1556.

- Yeast hypothetical protein YER081w.

- Yeast hypothetical protein YIL074w.

All these enzymes have similar enzymatic activities and are structurally related. Three 
of the most conserved regions of these proteins have been selected to develop patterns. The 
first pattern is based on a glycine-rich region located in the central section of these enzymes; 
this region probably corresponds to the NAD-binding domain. The two other patterns contain 
a number of conserved charged residues, some of which may play a role in the catalytic 
mechanism. 

-Consensus pattern: [LIVMA]-[AG]-[IVT]-[LIVMFY]-[AG]-x-G-[NHKRQGSAC]-[LIV]-
G-x(13,14)-[LIVMT]-x(2)-[FYWCTH]-[DNSTK] 

-Consensus pattern: [LIVMFYWA]-[LIVFYWC]-x(2)-[SAC]-[DNQHR]-[IVFA]-[LIVF]-x-
[LIVF]-[HNI]-x-P-x(4)-[STN]-x(2)-[LIVMF]-x-[GSDN] 

-Consensus pattern: [LMFATC]-[KPQ]-x-[GSTDN]-x-[LIVMFYWR]-[LIVMFYW](2)-N-x-
[STAGC]-R-[GP]-x-[LIVH]-[LIVMC]-[DNV] 
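
The consensus patterns in this section use a compact notation: a bracketed set lists the residues allowed at a position, a set in braces lists residues that are not allowed, x stands for any residue, and a number or range in parentheses repeats the preceding element. The following is a minimal sketch, not part of the original disclosure, of how such a pattern could be translated into a regular expression and searched against a sequence. The function name prosite_to_regex and the toy sequence are illustrative assumptions; the pattern is the first one listed above, written on one line.

```python
import re

# One pattern element: a residue, an allowed set [..], a forbidden set {..},
# or the wildcard x, optionally followed by a repeat such as (2) or (13,14).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex string."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unrecognized pattern element: {element!r}")
        core, low, high = m.groups()
        if core == "x":
            core = "."                      # x stands for any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # {..} lists residues that are not allowed
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


# First consensus pattern of the D-isomer specific dehydrogenases.
NAD_PATTERN = ("[LIVMA]-[AG]-[IVT]-[LIVMFY]-[AG]-x-G-[NHKRQGSAC]-[LIV]-"
               "G-x(13,14)-[LIVMT]-x(2)-[FYWCTH]-[DNSTK]")

# Hypothetical toy sequence built to satisfy the pattern; not a real protein.
toy_sequence = "MK" + "LGIIGLGRIG" + "A" * 13 + "LAAFD" + "KK"

hit = re.search(prosite_to_regex(NAD_PATTERN), toy_sequence)
# Report the 1-based start of the match and the matched segment.
print(hit.start() + 1, hit.group())
```

A match only indicates that a sequence is compatible with the signature; it does not by itself establish enzymatic activity.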

[1] Grant G.A. Biochem. Biophys. Res. Commun. 165:1371-1374(1989). 

[2] Kochhar S., Hunziker P., Leong-Morgenthaler P.M., Hottinger H. Biochem. Biophys. 
Res. Commun. 184:60-66(1992). 

[3] Ohta T., Taguchi H. J. Biol. Chem. 266:12588-12594(1991). 

[4] Goldberg J.D., Yoshida T., Brick P. J. Mol. Biol. 236:1123-1140(1994). 

[5] Popov V.O., Lamzin V.S. Biochem. J. 301:625-643(1994). 

734. 2-oxo acid dehydrogenases acyltransferase (catalytic domain) 
Refined crystal structure of the catalytic domain of dihydrolipoyl 
transacetylase (E2P) from Azotobacter vinelandii at 2.6 angstroms 
resolution. 
Mattevi A, Obmolova G, Kalk KH, Westphal AH, De Kok A, Hol WG; 
J Mol Biol 1993;230:1183-1199. 

These proteins contain one to three copies of a lipoyl binding domain 
followed by the catalytic domain. 

735. 3-beta hydroxysteroid dehydrogenase/isomerase family 
Structure and tissue-specific expression of 3 
beta-hydroxysteroid dehydrogenase/5-ene-4-ene isomerase 
genes in human and rat classical and peripheral 
steroidogenic tissues. 

Labrie F, Simard J, Luu-The V, Pelletier G, Belanger A, 
Lachance Y, Zhao HF, Labrie C, Breton N, de Launoit Y, et al 
J Steroid Biochem Mol Biol 1992;41:421-435. 
The enzyme 3 beta-hydroxysteroid dehydrogenase/5-ene-4-ene 
isomerase (3 beta-HSD) catalyzes the oxidation and isomerization 
of 5-ene-3 beta-hydroxypregnene and 5-ene-hydroxyandrostene 
steroid precursors into the corresponding 4-ene-ketosteroids necessary 
for the formation of all classes of steroid hormones. 

736. 3-hydroxyacyl-CoA dehydrogenase 
This family also includes lambda crystallin. 

Structure of L-3-hydroxyacyl-coenzyme A dehydrogenase: 
preliminary chain tracing at 2.8-A resolution. 
Birktoft JJ, Holden HM, Hamlin R, Xuong NH, Banaszak LJ; 
Proc Natl Acad Sci U S A 1987;84:8262-8266. 

3-hydroxyacyl-CoA dehydrogenase (EC 1.1.1.35) (HCDH) [1] is an enzyme involved 
in fatty acid metabolism; it catalyzes the oxidation of 3-hydroxyacyl-CoA to 
3-oxoacyl-CoA. Most eukaryotic cells have 2 fatty-acid beta-oxidation systems, 
one located in mitochondria and the other in peroxisomes. In peroxisomes 
3-hydroxyacyl-CoA dehydrogenase forms, with enoyl-CoA hydratase (ECH) and 
3,2-trans-enoyl-CoA isomerase (ECI), a multifunctional enzyme where the 
N-terminal domain bears the hydratase/isomerase activities and the C-terminal 
domain the dehydrogenase activity. There are two mitochondrial enzymes: one 
which is monofunctional and the other which is, like its peroxisomal 
counterpart, multifunctional. 

In Escherichia coli (gene fadB) and Pseudomonas fragi (gene faoA) HCDH is part 
of a multifunctional enzyme which also contains an ECH/ECI domain as well as a 
3-hydroxybutyryl-CoA epimerase domain [2]. 

The other proteins structurally related to HCDH are: 

- Bacterial 3-hydroxybutyryl-CoA dehydrogenase (EC 1.1.1.157) which reduces 
acetoacetyl-CoA to 3-hydroxybutanoyl-CoA [3]. 
- Eye lens protein lambda-crystallin [4], which is specific to lagomorphs 
(such as rabbit). 

There are two major regions of similarity in the sequences of proteins of the 
HCDH family: the first one, located in the N-terminal section, corresponds to the 
NAD-binding site; the second one is located in the center of the sequence. A signature 
pattern has been derived from this central region. 

-Consensus pattern: [DNE]-x(2)-[GA]-F-[LIVMFY]-x-[NT]-R-x(3)-[PA]-[LIVMFY](2)- 
x(5)-[LIVMFYCT]-[LIVMFY]-x(2)-[GV] 

[ 1] Birktoft J.J., Holden H.M., Hamlin R., Xuong N.-H., Banaszak L.J. 
Proc. Natl. Acad. Sci. U.S.A. 84:8262-8266(1987). 
[ 2] Nakahigashi K., Inokuchi H. 
Nucleic Acids Res. 18:4937-4937(1990). 
[ 3] Mullany P., Clayton C.L., Pallen M.J., Slone R., Al-Saleh A., 
Tabaqchali S. 
FEMS Microbiol. Lett. 124:61-67(1994). 
[ 4] Mulders J.W.M., Hendriks W., Blankesteijn W.M., Bloemendal H., 
de Jong W.W. 
J. Biol. Chem. 263:15462-15466(1988). 

737. 60s Acidic ribosomal protein 

Proteins P1, P2, and P0, components of the eukaryotic 
ribosome stalk. New structural and functional aspects. 
Remacha M, Jimenez-Diaz A, Santos C, Briones E, Zambrano R, 
Rodriguez Gabriel MA, Guarinos E, Ballesta JP; 
Biochem Cell Biol 1995;73:959-968. 

This family includes archaebacterial L12, eukaryotic P0, P1 and P2. 

738. 6-phosphogluconate dehydrogenases 

6-phosphogluconate dehydrogenase (EC 1.1.1.44) (6PGD) catalyzes the third step 
in the hexose monophosphate shunt, the oxidative decarboxylation of 
6-phosphogluconate into ribulose 5-phosphate. 

Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose 
sequences are highly conserved [1]. A region which has been shown [2], from studies 
of the sheep 6PGD tertiary structure, to be involved in the binding of 6-phosphogluconate 
has been selected as a signature pattern. 

-Consensus pattern: [LIVM]-x-D-x(2)-[GA]-[NQS]-K-G-T-G-x-W 

[ 1] Reizer A., Deutscher J., Saier M.H. Jr., Reizer J. 
Mol. Microbiol. 5:1081-1089(1991). 
[ 2] Adams M.J., Archibald I.G., Bugg C.E., Carne A., Gover S., 
Helliwell J.R., Pickersgill R.W., White S.W. 
EMBO J. 2:1009-1014(1983). 



739. (7tm 1) G-protein coupled receptors 

G-protein coupled receptors [1 to 4,E1,E2] (also called R7G) are an extensive 
group of hormone, neurotransmitter, odorant and light receptors which 
transduce extracellular signals by interaction with guanine nucleotide-binding 
(G) proteins. The receptors that are currently known to belong to this 
family are listed below. 

- 5-hydroxytryptamine (serotonin) 1A to 1F, 2A to 2C, 4, 5A, 5B, 6 and 7 [5]. 

- Acetylcholine, muscarinic-type, M1 to M5. 

- Adenosine A1, A2A, A2B and A3 [6]. 

- Adrenergic alpha-1A to -1C; alpha-2A to -2D; beta-1 to -3 [7]. 

- Angiotensin II types I and II. 

- Bombesin subtypes 3 and 4. 

- Bradykinin B1 and B2. 

- C3a and C5a anaphylatoxin. 

- Cannabinoid CB1 and CB2. 

- Chemokines C-C CC-CKR-1 to CC-CKR-8. 

- Chemokines C-X-C CXC-CKR-1 to CXC-CKR-4. 

- Cholecystokinin-A and cholecystokinin-B/gastrin. 

- Dopamine D1 to D5 [8]. 

- Endothelin ET-a and ET-b [9]. 

- fMet-Leu-Phe (fMLP) (N-formyl peptide). 

- Follicle stimulating hormone (FSH-R) [10]. 

- Galanin. 

- Gastrin-releasing peptide (GRP-R). 

- Gonadotropin-releasing hormone (GNRH-R). 

- Histamine H1 and H2 (gastric receptor I). 

- Lutropin-choriogonadotropic hormone (LSH-R) [10]. 

- Melanocortin MC1R to MC5R. 

- Melatonin. 

- Neuromedin B (NMB-R). 

- Neuromedin K (NK-3R). 

- Neuropeptide Y types 1 to 6. 

- Neurotensin (NT-R). 

- Octopamine (tyramine), from insects. 

- Odorants [11]. 

- Opioids delta-, kappa- and mu-types [12]. 

- Oxytocin (OT-R). 

- Platelet activating factor (PAF-R). 

- Prostacyclin. 

- Prostaglandin D2. 

- Prostaglandin E2, EP1 to EP4 subtypes. 

- Prostaglandin F2. 

- Purinoreceptors (ATP) [13]. 

- Somatostatin types 1 to 5. 

- Substance-K (NK-2R). 

- Substance-P (NK-1R). 

- Thrombin. 

- Thromboxane A2. 

- Thyrotropin (TSH-R) [10]. 

- Thyrotropin releasing factor (TRH-R). 

- Vasopressin V1a, V1b and V2. 

- Visual pigments (opsins and rhodopsin) [14]. 

- Proto-oncogene mas. 

- A number of orphan receptors (whose ligand is not known) from mammals and 
birds. 

- Caenorhabditis elegans putative receptors C06G4.5, C38C10.1, C43C3.2, 
T27D1.3 and ZC84.4. 

- Three putative receptors encoded in the genome of cytomegalovirus: US27, 
US28, and UL33. 

- ECRF3, a putative receptor encoded in the genome of herpesvirus saimiri. 

The structure of all these receptors is thought to be identical. They have 
seven hydrophobic regions, each of which most probably spans the membrane. 
The N-terminus is located on the extracellular side of the membrane and is 
often glycosylated, while the C-terminus is cytoplasmic and generally 
phosphorylated. Three extracellular loops alternate with three intracellular 
loops to link the seven transmembrane regions. Most, but not all, of these 
receptors lack a signal peptide. The most conserved parts of these proteins 
are the transmembrane regions and the first two cytoplasmic loops. A conserved 
acidic-Arg-aromatic triplet is present in the N-terminal extremity of the 
second cytoplasmic loop [15] and could be implicated in the interaction with G 
proteins. 
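
The seven membrane-spanning hydrophobic regions described above are the kind of feature that a sliding-window hydropathy scan can make visible. The sketch below is offered only as an illustration and is not a method stated in this specification: it uses the Kyte-Doolittle hydropathy scale, and the window of 19 residues and the threshold of 1.6 are conventional choices assumed here. The function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[residue] for residue in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def count_hydrophobic_segments(seq, window=19, threshold=1.6):
    """Count separate runs of windows whose mean hydropathy reaches the threshold."""
    count = 0
    inside = False
    for value in hydropathy_profile(seq, window):
        above = value >= threshold
        if above and not inside:
            count += 1          # a new hydrophobic run starts here
        inside = above
    return count


# Hypothetical toy sequence: seven leucine stretches separated by charged loops.
loop = "DKRSE" * 5
helix = "L" * 21
toy_receptor = loop + (helix + loop) * 7

print(count_hydrophobic_segments(toy_receptor))
```

Real receptors have far less clean profiles than this toy sequence, so the count from such a scan is at best a first indication and not a topology prediction.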

To detect this widespread family of proteins, a pattern that contains the conserved 
triplet and that also spans the major part of the third transmembrane helix has 
been developed. 

-Consensus pattern: [GSTALIVMFYWC]-[GSTANCPDE]-{EDPKRH}-x(2)-
[LIVMNQGA]-x(2)-[LIVMFT]-[GSTANC]-[LIVMFYWSTAC]-[DENH]-R-[FYWCSH]-x(2)-
[LIVM] 
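
The pattern above combines an excluded set, {EDPKRH}, with the conserved acidic-Arg-aromatic triplet. As a hedged illustration, the pattern can be written by hand as a regular expression with the triplet placed in a capture group so that its position can be reported. The helper name find_triplet and both peptides are assumptions made for this sketch and are not real receptor sequences.

```python
import re

# Hand translation of the consensus pattern; the acidic-Arg-aromatic
# triplet ([DENH]-R-[FYWCSH]) is wrapped in a capture group.
GPCR_SIGNATURE = re.compile(
    r"[GSTALIVMFYWC][GSTANCPDE][^EDPKRH].{2}[LIVMNQGA].{2}"
    r"[LIVMFT][GSTANC][LIVMFYWSTAC]([DENH]R[FYWCSH]).{2}[LIVM]"
)


def find_triplet(sequence):
    """Return (1-based position, triplet) for the first signature hit, or None."""
    hit = GPCR_SIGNATURE.search(sequence)
    if hit is None:
        return None
    return hit.start(1) + 1, hit.group(1)


# Hypothetical peptide that satisfies every position of the pattern.
toy_helix = "ASLAALAALSLDRYLAI"
# Same peptide with Glu at the third position, which {EDPKRH} forbids.
mutant = "ASEAALAALSLDRYLAI"

print(find_triplet(toy_helix))
print(find_triplet(mutant))
```

The second call shows the effect of the excluded set: a single forbidden residue at that position is enough to prevent a match even though the triplet itself is intact.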

[ 1] Strosberg A.D. Eur. J. Biochem. 196:1-10(1991). 
[ 2] Kerlavage A.R. Curr. Opin. Struct. Biol. 1:394-401(1991). 
[ 3] Probst W.C., Snyder L.A., Schuster D.I., Brosius J., Sealfon S.C. DNA Cell Biol. 11:1-20(1992). 
[ 4] Savarese T.M., Fraser C.M. Biochem. J. 283:1-9(1992). 
[ 5] Branchek T. Curr. Biol. 3:315-317(1993). 
[ 6] Stiles G.L. J. Biol. Chem. 267:6451-6454(1992). 
[ 7] Friell T., Kobilka B.K., Lefkowitz R.J., Caron M.G. Trends Neurosci. 11:321-324(1988). 
[ 8] Stevens C.F. Curr. Biol. 1:20-22(1991). 
[ 9] Sakurai T., Yanagisawa M., Masaki T. Trends Pharmacol. Sci. 13:103-107(1992). 
[10] Salesse R., Remy J.J., Levin J.M., Jallal B., Garnier J. Biochimie 73:109-120(1991). 
[11] Lancet D., Ben-Arie N. Curr. Biol. 3:668-674(1993). 
[12] Uhl G.R., Childers S., Pasternak G. Trends Neurosci. 17:89-93(1994). 
[13] Barnard E.A., Burnstock G., Webb T.E. Trends Pharmacol. Sci. 15:67-70(1994). 
[14] Applebury M.L., Hargrave P.A. Vision Res. 26:1881-1895(1986). 
[15] Attwood T.K., Eliopoulos E.E., Findlay J.B.C. Gene 98:153-159(1991). 

(7tm 1) Visual pigments (opsins) retinal binding site 

Visual pigments [1,2] are the light-absorbing molecules that mediate vision. 
They consist of an apoprotein, opsin, covalently linked to the chromophore 
cis-retinal. Vision is effected through the absorption of a photon by 
cis-retinal which is isomerized to trans-retinal. This isomerization leads to a 
change of conformation of the protein. Opsins are integral membrane proteins 
with seven transmembrane regions that belong to family 1 of G-protein coupled 
receptors. 

In vertebrates four different pigments are generally found. Rod cells, which 
mediate vision in dim light, contain the pigment rhodopsin. Cone cells, which 
function in bright light, are responsible for color vision and contain three 
or more color pigments (for example, in mammals: red, blue and green). 

In Drosophila, the eye is composed of 800 facets or ommatidia. Each 
ommatidium contains eight photoreceptor cells (R1-R8): the R1 to R6 cells are 
outer cells, R7 and R8 inner cells. Each of the three types of cells (R1-R6, 
R7 and R8) expresses a specific opsin. 

Proteins evolutionarily related to opsins include squid retinochrome, also known 
as retinal photoisomerase, which converts various isomers of retinal into 
11-cis retinal, and mammalian retinal pigment epithelium (RPE) RGR [3], a protein 
that may also act in retinal isomerization. 

The attachment site for retinal in the above proteins is a conserved lysine 
residue in the middle of the seventh transmembrane helix. The pattern 
that had been developed includes this residue. 

-Consensus pattern: [LIVMWAC]-[PGC]-x(3)-[SAC]-K-[STALIMR]-[GSACPNV]-
[STACP]-x(2)-[DENF]-[AP]-x(2)-[IY] 

[K is the retinal binding site] 

[ 1] Applebury M.L., Hargrave P.A. Vision Res. 26:1881-1895(1986). 
[ 2] Fryxell K.J., Meyerowitz E.M. J. Mol. Evol. 33:367-378(1991). 
[ 3] Shen D., Jiang M., Hao W., Tao L., Salazar M., Fong H.K.W. Biochemistry 33:13117-13125(1994). 

The following descriptions of protein family functions are not provided by the Pfam or 
Prosite databases. 

740. BAH 

BAH domain. Number of members: 65 

[1] Medline: 97074677. Molecular cloning of polybromo, a nuclear protein containing 
multiple domains including five bromodomains, a truncated HMG-box, and two repeats of 
a novel domain. Nicolas RH, Goodwin GH; Gene 1996;175:233-240. 
[2] Medline: 99198739. The BAH (bromo-adjacent homology) domain: a link between 
DNA methylation, replication and transcriptional regulation. Callebaut I, Courvalin J-C, 
Mornon JP; FEBS Lett 1999;446:189-193. 

741. ELM2 

ELM2 domain. The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of 
unknown function. Number of members: 10 

742. Euk porin. EUKARYOTIC_PORIN The major protein of the outer mitochondrial 
membrane of eukaryotes is a porin that forms a voltage-dependent anion-selective 
channel (VDAC) that behaves as a general diffusion pore for small hydrophilic molecules [1 
to 4]. The channel adopts an open conformation at low or zero membrane potential and a 
closed conformation at potentials above 30-40 mV. 

This protein contains about 280 amino acids and its sequence is composed of 12 
to 16 beta-strands that span the mitochondrial outer membrane. Yeast contains two 
members of this family (genes POR1 and POR2); vertebrates have at least three members 
(genes VDAC1, VDAC2 and VDAC3) [5]. 

A conserved region located at the C-terminal part of these proteins was selected as a 
signature pattern. 

Consensus pattern: [YH]-x(2)-D-[SPCAD]-x-[STA]-x(3)-[TAG]-[KR]-[LIVMF]-[DNSTA]-
[DNS]-x(4)-[GSTAN]-[LIVMA]-x-[LIVMY] 

[ 1] Benz R. Biochim. Biophys. Acta 1197:167-196(1994). 
[ 2] Mannella C.A. Trends Biochem. Sci. 17:315-320(1992). 
[ 3] Dihanich M. Experientia 46:146-153(1990). 
[ 4] Forte M., Guy H.R., Mannella C.A. J. Bioenerg. Biomembr. 19:341-350(1987). 
[ 5] Sampson M.J., Lovell R.S., Davison D.B., Craigen W.J. Genomics 36:192-196(1996). 

743. Glyco hydro 19 
Chitinases family 19 signatures 

cross-reference(s) CHITINASE_19_1, CHITINASE_19_2 

Chitinases (EC 3.2.1.14) [1] are enzymes that catalyze the hydrolysis of the 
beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. From the viewpoint of sequence 
similarity chitinases belong to either family 18 or 19 in the classification of glycosyl 
hydrolases [2,E1]. Chitinases of family 19 (also known as classes IA or I and IB or II) 
are enzymes from plants that function in the defense against fungal and insect pathogens 
by destroying their chitin-containing cell wall. Class IA/I and IB/II enzymes differ in the 
presence (IA/I) or absence (IB/II) of an N-terminal chitin-binding domain (see the relevant 
entry <PDOC00025>). The catalytic domain of these enzymes consists of about 220 to 230 
amino acid residues. 

Two highly conserved regions were selected as signature patterns, the first one is located in 
the N-terminal section and contains one of the six cysteines which are conserved in most, 
if not all, of these chitinases and which is probably involved in a disulfide bond. 

Consensus pattern: C-x(4,5)-F-Y-[ST]-x(3)-[FY]-[LIVMF]-x-A-x(3)-[YF]-x(2)-F-[GSA] 
Consensus pattern: [LIVM]-[GSA]-F-x-[STAG](2)-[LIVMFY]-W-[FY]-W-[LIVM] 

[ 1] Flach J., Pilet P.-E., Jolles P. Experientia 48:701-716(1992). 
[ 2] Henrissat B. Biochem. J. 280:309-316(1991). 

744. MBD 

Methyl-CpG binding domain 

The Methyl-CpG binding domain (MBD) binds to DNA that contains one or more 
symmetrically methylated CpGs [1]. DNA methylation in animals is associated with 
alterations in chromatin structure and silencing of gene expression. MBD has negligible 
non-specific affinity for DNA. In vitro foot-printing with MeCP2 showed the MBD can protect a 
12 nucleotide region surrounding a methyl CpG pair [1]. MBDs are found in several 
Methyl-CpG binding proteins and also DNA demethylase [2]. Number of members: 11 

[1] Medline: 94232813. Dissection of the methyl-CpG binding domain from the chromosomal 
protein MeCP2. Nan X, Meehan RR, Bird A; Nucleic Acids Res 1993;21:4886-4892. 
[2] Medline: 99158138. A mammalian protein with specific demethylase activity for mCpG 
DNA. Bhattacharya SK, Ramchandani S, Cervoni N, Szyf M; Nature 1999;397:579-583. 

745. Peptidase C1 

Eukaryotic thiol (cysteine) proteases active sites 

cross-reference(s) THIOL_PROTEASE_CYS; THIOL_PROTEASE_HIS; 
THIOL_PROTEASE_ASN 

Eukaryotic thiol proteases (EC 3.4.22.-) [1] are a family of proteolytic enzymes which 
contain an active site cysteine. Catalysis proceeds through a thioester intermediate and is 
facilitated by a nearby histidine side chain: an asparagine completes the essential catalytic 
triad. The proteases which are currently known to belong to this family are listed below 
(references are only provided for recently determined sequences). 

- Vertebrate lysosomal cathepsins B (EC 3.4.22.1), H (EC 3.4.22.16), L (EC 3.4.22.15), 
and S (EC 3.4.22.27) [2]. 

- Vertebrate lysosomal dipeptidyl peptidase I (EC 3.4.14.1) (also known as cathepsin C) 
[2]. 

- Vertebrate calpains (EC 3.4.22.17). Calpains are intracellular calcium-activated thiol 
proteases that contain both an N-terminal catalytic domain and a C-terminal calcium-binding 
domain. 

- Mammalian cathepsin K, which seems involved in osteoclastic bone resorption [3]. 

- Human cathepsin O [4]. 

- Bleomycin hydrolase. An enzyme that catalyzes the inactivation of the antitumor drug 
BLM (a glycopeptide). 

- Plant enzymes: barley aleurain (EC 3.4.22.16), EP-B1/B4; kidney bean EP-C1, rice bean 
SH-EP; kiwi fruit actinidin (EC 3.4.22.14); papaya latex papain (EC 3.4.22.2), 
chymopapain (EC 3.4.22.6), caricain (EC 3.4.22.30), and proteinase IV (EC 3.4.22.25); 
pea turgor-responsive protein 15A; pineapple stem bromelain (EC 3.4.22.32); rape COT44; 
rice oryzain alpha, beta, and gamma; tomato low-temperature induced; Arabidopsis 
thaliana A494, RD19A and RD21A. 

- House-dust mite allergens DerP1 and EurM1. 

- Cathepsin B-like proteinases from the worms Caenorhabditis elegans (genes gcp-1, cpr-3, 
cpr-4, cpr-5 and cpr-6), Schistosoma mansoni (antigen SM31) and japonicum (antigen 
SJ31), Haemonchus contortus (genes AC-1 and AC-2), and Ostertagia ostertagi (CP-1 and 
CP-3). 

- Slime mold cysteine proteinases CP1 and CP2. 

- Cruzipain from Trypanosoma cruzi and brucei. 

- Trophozoite cysteine proteinase (TCP) from various Plasmodium species. 

- Proteases from Leishmania mexicana, Theileria annulata and Theileria parva. 

- Baculoviruses cathepsin-like enzyme (v-cath). 

- Drosophila small optic lobes protein (gene sol), a neuronal protein that contains a 
calpain-like domain. 

- Yeast thiol protease BLH1/YCP1/LAP3. 

- Caenorhabditis elegans hypothetical protein C06G4.2, a calpain-like protein. 

Two bacterial peptidases are also part of this family: 

- Aminopeptidase C from Lactococcus lactis (gene pepC) [5]. 

- Thiol protease tpr from Porphyromonas gingivalis. 

Three other proteins are structurally related to this family, but may have lost their 
proteolytic activity. 

- Soybean oil body protein P34. This protein has its active site cysteine replaced by a 
glycine. 

- Rat testin, a Sertoli cell secretory protein highly similar to cathepsin L but with the 
active site cysteine replaced by a serine. Rat testin should not be confused with mouse 
testin which is a LIM-domain protein (see <PDOC00382>). 

- Plasmodium falciparum serine-repeat protein (SERA), the major blood stage antigen. 
This protein of 111 Kd possesses a C-terminal thiol-protease-like domain [6], but the active 
site cysteine is replaced by a serine. 

The sequences around the three active site residues are well conserved and can be used as 
signature patterns. 

Consensus pattern: Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC]-[STAGCV] [C is 
the active site residue] 

Note: the residue in position 4 of the pattern is almost always cysteine; the only exceptions are 
calpains (Leu), bleomycin hydrolase (Ser) and yeast YCP1 (Ser). The residue in position 
5 of the pattern is always Gly except in papaya protease IV where it is Glu. 

Consensus pattern: [LIVMGSTAN]-x-H-[GSACE]-[LIVM]-x-[LIVMAT](2)-G-x-[GSADNH] 
[H is the active site residue] 

Consensus pattern: [FYCH]-[WI]-[LIVT]-x-[KRQAG]-N-[ST]-W-x(3)-[FYW]-G-x(2)-G-
[LFYW]-[LIVMFYG]-x-[LIVMF] [N is the active site residue] 

Note: these proteins belong to family C1 (papain-type) and C2 (calpains) in the classification 
of peptidases [7,E1]. 
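
Because each of the three signatures marks one residue of the catalytic triad, scanning a sequence with all three gives a quick check of whether the Cys, His and Asn sites all appear to be present. The sketch below is illustrative only: the regular expressions are hand translations of the three patterns above, and the function name catalytic_triad and the toy sequences are assumptions, not real proteases. The variant mimics the situation described for rat testin and SERA, where the active site cysteine is replaced by a serine.

```python
import re

# Hand translations of the three signatures; the active site residue of
# each pattern is wrapped in a capture group.
SIGNATURES = {
    "Cys": re.compile(r"Q.{3}[GE].(C)[YW].{2}[STAGC][STAGCV]"),
    "His": re.compile(r"[LIVMGSTAN].(H)[GSACE][LIVM].[LIVMAT]{2}G.[GSADNH]"),
    "Asn": re.compile(r"[FYCH][WI][LIVT].[KRQAG](N)[ST]W.{3}[FYW]G.{2}G"
                      r"[LFYW][LIVMFYG].[LIVMF]"),
}


def catalytic_triad(sequence):
    """Map each signature to the 1-based position of its active site residue, or None."""
    positions = {}
    for name, regex in SIGNATURES.items():
        hit = regex.search(sequence)
        positions[name] = hit.start(1) + 1 if hit else None
    return positions


# Hypothetical segments constructed to satisfy each signature.
cys_site = "QGSCGSCWAFSA"
his_site = "LDHAVLVVGYG"
asn_site = "YWIVKNSWGENWGMGGYIRM"
toy_protease = "MA" + cys_site + "PPPP" + his_site + "PPPP" + asn_site

# Variant with the active site cysteine (position 9) replaced by serine.
serine_variant = toy_protease[:8] + "S" + toy_protease[9:]

print(catalytic_triad(toy_protease))
print(catalytic_triad(serine_variant))
```

A missing Cys signature alongside intact His and Asn signatures is the pattern one would expect for the structurally related proteins listed above that may have lost proteolytic activity.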

[ 1] Dufour E. Biochimie 70:1335-1342(1988). 
[ 2] Kirschke H., Barrett A.J., Rawlings N.D. Protein Prof. 2:1587-1643(1995). 
[ 3] Shi G.-P., Chapman H.A., Bhairi S.M., Deleeuw C., Reddy V.Y., Weiss S.J. FEBS Lett. 
357:129-134(1995). 
[ 4] Velasco G., Ferrando A.A., Puente X.S., Sanchez L.M., Lopez-Otin C. J. Biol. Chem. 
269:27136-27142(1994). 
[ 5] Chapot-Chartier M.P., Nardi M., Chopin M.C., Chopin A., Gripon J.C. Appl. Environ. 
Microbiol. 59:330-333(1993). 
[ 6] Higgins D.G., McConnell D.J., Sharp P.M. Nature 340:604-604(1989). 
[ 7] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:461-486(1994). 

746. Peptidase M22 

Glycoprotease family signature 

cross-reference(s) GLYCOPROTEASE 

Glycoprotease (GCP) (EC 3.4.24.57) [1], or O-sialoglycoprotein endopeptidase, 
is a metalloprotease secreted by Pasteurella haemolytica which specifically 
cleaves O-sialoglycoproteins such as glycophorin A. The sequence of GCP is 
highly similar to the following uncharacterized proteins: 

- Escherichia coli hypothetical protein ygjD (ORF-X). 

- Bacillus subtilis hypothetical protein ydiE. 

- Mycobacterium leprae hypothetical protein U229E. 

- Mycobacterium tuberculosis hypothetical protein MtCY78.10. 

- Synechocystis strain PCC 6803 hypothetical protein slr0807. 

- Methanococcus jannaschii hypothetical protein MJ1130. 

- Haloarcula marismortui hypothetical protein in HSH 3'region. 

- Yeast hypothetical protein YKR038c. 

- Yeast hypothetical protein QRI7. 

One of the conserved regions contains two conserved histidines. It is possible 
that this region is involved in coordinating a metal ion such as zinc. 

Consensus pattern: [KR]-[GSAT]-x(4)-[FYWLH]-[DQNGK]-x-P-x-[LIVMFY]-x(3)-H-
x(2)-[AG]-H-[LIVM] 

Note: these proteins belong to family M22 in the classification of 
peptidases [2,E1]. 

[ 1] Abdullah K.M., Lo R.Y.C., Mellors A. J. Bacteriol. 173:5597-5603(1991). 
[ 2] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:183-228(1995). 

747. SAM. SAM domain (Sterile alpha motif) 

It has been suggested that SAM is an evolutionarily conserved protein binding domain that is 
involved in the regulation of numerous developmental processes in diverse eukaryotes. The 
SAM domain can potentially function as a protein interaction module through its ability to 
homo- and heterooligomerise with other SAM domains. Number of members: 81 

[1] Medline: 96100659 SAM: A novel motif in yeast sterile alpha and Drosophila 
polyhomeotic proteins. Ponting CP; Prot Sci 1995;4:1928-1930. 
[2] Medline: 97160498 SAM as a protein interaction domain involved in developmental 
regulation. Schultz J, Ponting CP, Hofmann K, Bork P; Prot Sci 1997;6:249-253. 
[3] Medline: 99101382 The crystal structure of an Eph receptor SAM domain reveals a 
mechanism for modular dimerization. Stapleton D, Balan I, Pawson 
T, Sicheri F; Nat Struct Biol 1999;6:44-49. 

748. Tyrosinase signatures 

cross-reference(s) TYROSINASE_1; TYROSINASE_2 

Tyrosinase (EC 1.14.18.1) [1] is a copper monooxygenase that catalyzes the 
hydroxylation of monophenols and the oxidation of o-diphenols to o-quinones. 
This enzyme, found in prokaryotes as well as in eukaryotes, is involved in the 
formation of pigments such as melanins and other polyphenolic compounds. 

Tyrosinase binds two copper ions (CuA and CuB). Each of the two copper ions has 
been shown [2] to be bound by three conserved histidine residues. The regions 
around these copper-binding ligands are well conserved and also shared by some 
hemocyanins, which are copper-containing oxygen carriers from the hemolymph of 
many molluscs and arthropods [3,4]. 

At least two proteins related to tyrosinase are known to exist in mammals: 

- TRP-1 (TYRP1) [5], which is responsible for the conversion of 
5,6-dihydroxyindole-2-carboxylic acid (DHICA) to indole-5,6-quinone-2-carboxylic acid. 

- TRP-2 (TYRP2) [6], which is the melanogenic enzyme DOPAchrome tautomerase 
(EC 5.3.3.12) that catalyzes the conversion of DOPAchrome to DHICA. TRP-2 
differs from tyrosinases and TRP-1 in that it binds two zinc ions instead 
of copper [7]. 

Other proteins that belong to this family are: 

- Plant polyphenol oxidases (PPO) (EC 1.10.3.1) which catalyze the oxidation 
of mono- and o-diphenols to o-diquinones [8]. 

- Caenorhabditis elegans hypothetical protein C02C2.1. 

Two signature patterns for tyrosinase and related proteins have been derived. 
The first one contains two of the histidines that bind CuA, and is located in 
the N-terminal section of tyrosinase. The second pattern contains a histidine 
that binds CuB; that pattern is located in the central section of the enzyme. 



Consensus pattern: H-x(4,5)-F-[LIVMFTP]-x-[FW]-H-R-x(2)-[LM]-x(3)-E 
[The two H's are copper ligands] 

Consensus pattern: D-P-x-F-[LIVMFYW]-x(2)-H-x(3)-D [H is a copper 
ligand] 

[ 1] Lerch K. Prog. Clin. Biol. Res. 256:85-98(1988). 
[ 2] Jackman M.P., Hajnal A., Lerch K. Biochem. J. 274:707-713(1991). 
[ 3] Linzen B. Naturwissenschaften 76:206-211(1989). 
[ 4] Lang W.H., van Holde K.E. Proc. Natl. Acad. Sci. U.S.A. 88:244-248(1991). 
[ 5] Kobayashi T., Urabe K., Winder A., Jimenez-Cervantes C., Imokawa G., Brewington T., 
Solano F., Garcia-Borron J.C., Hearing V.J. EMBO J. 13:5818-5825(1994). 
[ 6] Jackson I.J., Chambers D.M., Tsukamoto K., Copeland N.G., Gilbert D.J., Jenkins N.A., 
Hearing V. EMBO J. 11:527-535(1992). 
[ 7] Solano F., Martinez-Liarte J.H., Jimenez-Cervantes C., Garcia-Borron J.C., Lozano J.A. 
Biochem. Biophys. Res. Commun. 204:1243-1250(1994). 
[ 8] Cary J.W., Lax A.R., Flurkey W.H. Plant Mol. Biol. 20:245-253(1992). 

749. (Mur Ligase) Folylpolyglutamate synthase signatures 

Folylpolyglutamate synthase (EC 6.3.2.17) (FPGS) [1] is the enzyme of folate metabolism 
that catalyzes ATP-dependent addition of glutamate moieties to tetrahydrofolate. 

Its sequence is moderately conserved between prokaryotes (gene folC) and eukaryotes. 
We developed two signature patterns based on the conserved regions which are rich in 
glycine residues and could play a role in the catalytic 
activity and/or in substrate binding. 

Description of pattern(s) and/or profile(s) 

Consensus pattern: [LIVMFY]-x-[LIVM]-[STAG]-G-T-[NK]-G-K-x-[ST]-x(7)-[LIVM](2)-
x(3)-[GSK] 

Consensus pattern: [LIVMFY](2)-E-x-G-[LIVM]-[GA]-G-x(2)-D-x-[GST]-x-[LIVM](2) 

[ 1] Shane B., Garrow T., Brenner A., Chen L., Choi Y.J., Hsu J.C., Stover P. Adv. Exp. Med. 
Biol. 338:629-634(1993). 

750. (Peptidase M3) Neutral zinc metallopeptidases, zinc-binding region signature 

The majority of zinc-dependent metallopeptidases (with the notable exception of the 
carboxypeptidases) share a common pattern of primary structure [1,2,3] in the part of their 
sequence involved in the binding of zinc, and can be grouped together as a 
superfamily, known as the metzincins, on the basis of this sequence similarity. They can be 
classified into a number of distinct families [4,E1] which are listed below along with the 
proteases which are currently known to belong to these families. 

Family M1 

- Bacterial aminopeptidase N (EC 3.4.11.2) (gene pepN). 

- Mammalian aminopeptidase N (EC 3.4.11.2). 

- Mammalian glutamyl aminopeptidase (EC 3.4.11.7) (aminopeptidase A). It may play a 
role in regulating growth and differentiation of early B-lineage cells. 

- Yeast aminopeptidase yscII (gene APE2). 

- Yeast alanine/arginine aminopeptidase (gene AAP1). 

- Yeast hypothetical protein YIL137c. 

- Leukotriene A-4 hydrolase (EC 3.3.2.6). This enzyme is responsible for the hydrolysis of 
an epoxide moiety of LTA-4 to form LTB-4; it has been shown that it binds zinc and is 
capable of peptidase activity. 

Family M2 

- Angiotensin-converting enzyme (EC 3.4.15.1) (dipeptidyl carboxypeptidase I) (ACE), the 
enzyme responsible for hydrolyzing angiotensin I to angiotensin II. There are two forms 
of ACE: a testis-specific isozyme and a somatic isozyme which has two active centers. 

Family M3 

- Thimet oligopeptidase (EC 3.4.24.15), a mammalian enzyme involved in the cytoplasmic 
degradation of small peptides. 

- Neurolysin (EC 3.4.24.16) (also known as mitochondrial oligopeptidase M or microsomal 
endopeptidase). 

- Mitochondrial intermediate peptidase precursor (EC 3.4.24.59) (MIP). It is involved in the 
second stage of processing of some proteins imported in the mitochondrion. 

- Yeast saccharolysin (EC 3.4.24.37) (proteinase yscD). 

- Escherichia coli and related bacteria dipeptidyl carboxypeptidase (EC 3.4.15.5) (gene 
dcp). 

- Escherichia coli and related bacteria oligopeptidase A (EC 3.4.24.70) (gene opdA or prlC). 

- Yeast hypothetical protein YKL134c. 

Family M4 

- Thermostable thermolysins (EC 3.4.24.27), and related thermolabile neutral proteases 
(bacillolysins) (EC 3.4.24.28) from various species of Bacillus. 

- Pseudolysin (EC 3.4.24.26) from Pseudomonas aeruginosa (gene lasB). 

- Extracellular elastase from Staphylococcus epidermidis. 

- Extracellular protease prt1 from Erwinia carotovora. 

- Extracellular minor protease smp from Serratia marcescens. 

- Vibriolysin (EC 3.4.24.25) from various species of Vibrio. 

- Protease prtA from Listeria monocytogenes. 

- Extracellular proteinase proA from Legionella pneumophila. 

Family M5 

- Mycolysin (EC 3.4.24.31) from Streptomyces cacaoi. 

Family M6 

- Immune inhibitor A from Bacillus thuringiensis (gene ina). Ina degrades two classes of 
insect antibacterial proteins, attacins and cecropins. 

Family M7 

- Streptomyces extracellular small neutral proteases. 

Family M8 

- Leishmanolysin (EC 3.4.24.36) (surface glycoprotein gp63), a cell surface protease from 
various species of Leishmania. 

Family M9 

- Microbial collagenase (EC 3.4.24.3) from Clostridium perfringens and Vibrio 
alginolyticus. 

Family M10A 

- Serralysin (EC 3.4.24.40), an extracellular metalloprotease from Serratia. 

- Alkaline metalloproteinase from Pseudomonas aeruginosa (gene aprA). 

- Secreted proteases A, B, C and G from Erwinia chrysanthemi. 

- Yeast hypothetical protein YIL108w. 

Family M10B 

- Mammalian extracellular matrix metalloproteinases (known as matrixins) [5]: MMP-1 (EC
3.4.24.7) (interstitial collagenase), MMP-2 (EC 3.4.24.24) (72 Kd gelatinase), MMP-9 (EC
3.4.24.35) (92 Kd gelatinase), MMP-7 (EC 3.4.24.23) (matrilysin), MMP-8 (EC 3.4.24.34)
(neutrophil collagenase), MMP-3 (EC 3.4.24.17) (stromelysin-1), MMP-10 (EC 3.4.24.22)
(stromelysin-2), MMP-11 (stromelysin-3), and MMP-12 (EC 3.4.24.65) (macrophage
metalloelastase).

- Sea urchin hatching enzyme (envelysin) (EC 3.4.24.12). A protease that allows the 
embryo to digest the protective envelope derived from the egg extracellular matrix. 

- Soybean metalloendoproteinase 1.

Family M11

- Chlamydomonas reinhardtii gamete lytic enzyme (GLE).

Family M12A

- Astacin (EC 3.4.24.21), a crayfish endoprotease. 

- Meprin A (EC 3.4.24.18), a mammalian kidney and intestinal brush border 
metalloendopeptidase. 

- Bone morphogenic protein 1 (BMP-1), a protein which induces cartilage and bone
formation and which expresses metalloendopeptidase activity. The Drosophila homolog
of BMP-1 is the dorsal-ventral patterning protein tolloid.

- Blastula protease 10 (BP10) from Paracentrotus lividus and the related protein SpAN
from Strongylocentrotus purpuratus. 
- Caenorhabditis elegans protein toh-2.

- Caenorhabditis elegans hypothetical protein F42A10.8. 

- Choriolysins L and H (EC 3.4.24.67) (also known as embryonic hatching proteins LCE
and HCE) from the fish Oryzias latipes. These proteases participate in the breakdown
of the egg envelope, which is derived from the egg extracellular matrix, at the time of
hatching.

Family M12B 

- Snake venom metalloproteinases [6]. This subfamily mostly groups proteases that act in
hemorrhage. Examples are: adamalysin II (EC 3.4.24.46), atrolysin C/D (EC
3.4.24.42), atrolysin E (EC 3.4.24.44), fibrolase (EC 3.4.24.72), trimerelysin I (EC
3.4.24.52) and II (EC 3.4.24.53).

- Mouse cell surface antigen MS2. 

Family M13

- Mammalian neprilysin (EC 3.4.24.11) (neutral endopeptidase) (NEP).

- Endothelin-converting enzyme 1 (EC 3.4.24.71) (ECE-1), which processes the precursor of
endothelin to release the active peptide.

- Kell blood group glycoprotein, a major antigenic protein of erythrocytes. The Kell protein
is very probably a zinc endopeptidase.

- Peptidase O from Lactococcus lactis (gene pepO).

Family M27 

- Clostridial neurotoxins, including tetanus toxin (TeTx) and the various botulinum toxins
(BoNT). These toxins are zinc proteases that block neurotransmitter release by
proteolytic cleavage of synaptic proteins such as synaptobrevins, syntaxin and SNAP-25
[7,8].

Family M30

- Staphylococcus hyicus neutral metalloprotease.

Family M32

- Thermostable carboxypeptidase 1 (EC 3.4.17.19) (carboxypeptidase Taq), an enzyme
from Thermus aquaticus which is most active at high temperature. 

Family M34 

- Lethal factor (LF) from Bacillus anthracis, one of the three proteins composing the
anthrax toxin. 

Family M35 

- Deuterolysin (EC 3.4.24.39) from Penicillium citrinum and related proteases from various 
species of Aspergillus.

Family M36 

- Extracellular elastinolytic metalloproteinases from Aspergillus. 

From the tertiary structure of thermolysin, the positions of the residues acting as zinc
ligands and of those involved in the catalytic activity are known. Two of the zinc ligands are
histidines which are very close together in the sequence; C-terminal to the first histidine is
a glutamic acid residue which acts as a nucleophile and promotes the attack of a water
molecule on the carbonyl carbon of the substrate. A signature pattern which includes the
two histidines and the glutamic acid residue is sufficient to detect this superfamily of
proteins.

Description of pattern(s) and/or profile(s) 

Consensus pattern: [GSTALIVN]-x(2)-H-E-[LIVMFYW]-{DEHRKP}-H-x-[LIVMFYWGSPQ]
[The two H's are zinc ligands] [E is the active site residue]

Sequences known to belong to this class detected by the pattern: ALL,
except for members of families M5, M7 and M11.

Other sequence(s) detected in SWISS-PROT: 55; including Neurospora
crassa conidiation-specific protein 13 which could be a
zinc-protease.
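
As an illustration only, the zinc-binding consensus pattern of this section can be translated into a regular expression and used to scan a protein sequence. The Python sketch below is not part of PROSITE or of any tool named in this document: the function names prosite_to_regex and find_matches and the toy sequences are assumptions made for the example, and only the pattern syntax that appears in this section (x, x(n), x(m,n), [..] and {..}) is handled.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper for illustration; supports x, x(n), x(m,n),
    [allowed residues], {excluded residues} and single residues.
    """
    parts = []
    for element in pattern.split("-"):
        # Separate the residue specification from an optional repeat suffix.
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":
            token = "."                      # any residue
        elif core.startswith("[") and core.endswith("]"):
            token = core                     # one of the listed residues
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"  # any residue except those listed
        elif len(core) == 1 and core.isupper():
            token = core                     # one fixed residue
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
        if repeat:
            token += "{" + repeat + "}"
        parts.append(token)
    return "".join(parts)

def find_matches(pattern: str, sequence: str):
    """Return (start, matched_text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]

# Consensus pattern as given in this section.
ZINC_PROTEASE = "[GSTALIVN]-x(2)-H-E-[LIVMFYW]-{DEHRKP}-H-x-[LIVMFYWGSPQ]"

# Toy sequences invented for the example (not real database entries).
print(prosite_to_regex(ZINC_PROTEASE))
print(find_matches(ZINC_PROTEASE, "MKKVAAHELTHAVTD"))
print(find_matches(ZINC_PROTEASE, "MKKVAAHELDHAVTD"))  # D is excluded at position 7
```

In the second toy sequence the residue between the glutamate-adjacent position and the second histidine is an aspartate, which the {DEHRKP} element excludes, so no match is reported.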

[ 1] Jongeneel C.V., Bouvier J., Bairoch A.
FEBS Lett. 242:211-214(1989).
[ 2] Murphy G.J.P., Murphy G., Reynolds J.J.
FEBS Lett. 289:4-7(1991).

[ 3] Bode W., Grams F., Reinemer P., Gomis-Rueth F.-X., Baumann U., McKay
D.B., Stoecker W.
Zoology 99:237-246(1996).

[ 4] Rawlings N.D., Barrett A.J.
Meth. Enzymol. 248:183-228(1995).
[ 5] Woessner J. Jr.
FASEB J. 5:2145-2154(1991).
[ 6] Hite L.A., Fox J.W., Bjarnason J.B.
[ 7] Montecucco C., Schiavo G.
Trends Biochem. Sci. 18:324-327(1993).
[ 8] Niemann H., Blasi J., Jahn R.
Trends Cell Biol. 4:179-185(1994).


751. PseudoU_synt_1

tRNA pseudouridine synthase is involved in the formation of pseudouridine at the anticodon
stem and loop of transfer-RNAs. Pseudouridine is an isomer of uridine
(5-(beta-D-ribofuranosyl)uracil) and is the most abundant modified nucleoside found in all cellular
RNAs. The TruA-like proteins also exhibit a conserved sequence with a strictly conserved
aspartic acid, likely involved in catalysis. Number of members: 25

[1] Medline: 98254513. Transfer RNA-pseudouridine synthetase Pus1 of Saccharomyces
cerevisiae contains one atom of zinc essential for its native conformation and tRNA
recognition. Arluison V, Hountondji C, Robert B, Grosjean H; Biochemistry 1998;37:7268-7276.

752. EPSP synthase signatures

EPSP synthase (3-phosphoshikimate 1-carboxyvinyltransferase) (EC 2.5.1.19) catalyzes the 
sixth step in the biosynthesis from chorismate of the aromatic amino acids (the shikimate 
pathway) in bacteria (gene aroA), plants and fungi (where it is part of a multifunctional 
enzyme which catalyzes five consecutive steps in this pathway) [1]. EPSP synthase has been
extensively studied as it is the target of the potent herbicide glyphosate which inhibits the 
enzyme. 

The sequence of EPSP synthase from various biological sources shows that the structure of the enzyme
has been well conserved throughout evolution. Two conserved regions were selected as
signature patterns. The first pattern corresponds to a region that is part of the active site and
which is also important for the resistance to glyphosate [2]. The second pattern is located in
the C-terminal part of the protein and contains a conserved lysine which seems to be
important for the activity of the enzyme.

Description of pattern(s) and/or profile(s) 

Consensus pattern: [LIVM]-x(2)-[GN]-N-[SA]-G-T-[STA]-x-R-x-[LIVMY]-x-[GSTA]
Consensus pattern: [KR]-x-[KH]-E-[CST]-[DNE]-R-[LIVM]-x-[STA]-[LIVMC]-x(2)-[EN]-
[LIVMF]-x-[KRA]-[LIVMF]-G

[ 1] Stallings W.C., Abdel-Megid S.S., Lim L.W., Shieh U.S., Dayringer H.E., Leimgruber
N.K., Stegeman R.A., Anderson K.S., Sikorski J.A., Padgette S.R., Kishore G.M. Proc.
Natl. Acad. Sci. U.S.A. 88:5046-5050(1991).

[ 2] Padgette S.R., Re D.B., Gaser C.S., Eicholtz D.A., Frazier R.B., Hironaka C.M., Levine
E.B., Shah D.M., Fraley R.T., Kishore G.M. J. Biol. Chem. 266:22364-22369(1991).

753. Glyco_hydro_18

Glycosyl hydrolases family 18. Number of members: 173 

[1] Medline: 95219379. Crystal structure of a bacterial chitinase at 2.3 A resolution. Perrakis
A, Tews I, Dauter Z, Oppenheim AB, Chet I, Wilson KS, Vorgias CE; Structure
1994;2:1169-1180.

754. Esterase 
Putative esterase 
This family contains Esterase D Swiss:P10768. However it is not clear if all members of the
family have the same function. This family is possibly related to the COesterase family. 
Number of members: 36 



755. (HMA) Heavy-metal-associated domain 

A conserved domain of about 30 amino acid residues has been found [1] in a number of 
proteins that transport or detoxify heavy metals. This domain contains two conserved 
cysteines that could be involved in the binding of these metals. The domain has been 
termed Heavy-Metal-Associated (HMA). It has been found in:

- A variety of cation transport ATPases (E1-E2 ATPases) (see <PDOC00139>). The
human copper ATPases ATP7A and ATP7B, which are respectively involved in
Menkes' and Wilson's diseases, both contain 6 tandem copies of the
HMA domain. The copper ATPases CCC2 from budding yeast, copA from
Enterococcus faecalis and synA from Synechococcus contain one copy of the HMA
domain. The cadmium ATPases cadA from Bacillus firmus and from plasmid pI258
from Staphylococcus aureus also contain a single HMA domain, while a chromosomal
Staphylococcus aureus cadA contains two copies. Other, less characterized ATPases
that contain the HMA domain are: fixI from Rhizobium meliloti, pacS from
Synechococcus (strain PCC 7942), Mycobacterium leprae ctpA and ctpB and
Escherichia coli hypothetical protein yhhO. In all these ATPases the HMA domain(s)
are located in the N-terminal section.

- Mercuric reductase (EC 1.16.1.1) (gene merA) which is generally encoded by plasmids
carried by mercury-resistant Gram-negative bacteria. Mercuric reductase is a class-1
pyridine nucleotide-disulphide oxidoreductase (see <PDOC00073>). There is
generally one HMA domain (with the exception of a chromosomal merA from
Bacillus strain RC607 which has two) in the N-terminal part of merA.

- Mercuric transport protein periplasmic component (gene merP), also encoded by
plasmids carried by mercury-resistant Gram-negative bacteria. It seems to be a
mercury scavenger that specifically binds to one Hg(2+) ion and which passes it to
the mercuric reductase via the merT protein. The N-terminal half of merP is a HMA
domain.

- Helicobacter pylori copper-binding protein copP.

- Yeast protein ATX1 [2], which could act in the transport and/or partitioning of
copper. 

The consensus pattern for HMA spans the complete domain. 


Description of pattern(s) and/or profile(s) 

Consensus pattern: [LIVN]-x(2)-[LIVMFA]-x-C-x-[STAGCDNH]-C-x(3)-[LIVFG]-x(3)-
[LIV]-x(9,11)-[IVA]-x-[LVFYS] [The two C's probably bind metals]
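
The statement that the HMA pattern spans the complete domain of about 30 residues can be checked by simple arithmetic on the pattern elements. The following Python sketch is an illustration under stated assumptions: pattern_length_range is a hypothetical helper name, and each element without a repeat suffix is assumed to occupy exactly one residue.

```python
import re

def pattern_length_range(pattern: str) -> tuple[int, int]:
    """Return the (minimum, maximum) number of residues a pattern can span.

    Hypothetical helper: an element such as x(3) counts as 3 residues,
    x(9,11) as 9 to 11 residues, and any other element as 1 residue.
    """
    shortest = longest = 0
    for element in pattern.split("-"):
        m = re.search(r"\((\d+)(?:,(\d+))?\)$", element)
        if m:
            n_min = int(m.group(1))
            n_max = int(m.group(2) or m.group(1))
        else:
            n_min = n_max = 1
        shortest += n_min
        longest += n_max
    return shortest, longest

# HMA consensus pattern as given in this section.
HMA = ("[LIVN]-x(2)-[LIVMFA]-x-C-x-[STAGCDNH]-C-x(3)-[LIVFG]-x(3)-"
       "[LIV]-x(9,11)-[IVA]-x-[LVFYS]")

print(pattern_length_range(HMA))  # (29, 31)
```

The pattern therefore covers 29 to 31 residues, which agrees with the description of a domain of about 30 amino acid residues.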

[ 1] Bull P.C., Cox D.W. Trends Genet. 10:246-252(1994).

[ 2]Lin S.-J., Culotta V.L. Proc. Natl. Acad. Sci. U.S.A. 92:3784-3788(1995). 

756. (Peptidase M10) Matrixins cysteine switch
PROSITE cross-reference(s): CYSTEINE_SWITCH 

Mammalian extracellular matrix metalloproteinases (EC 3.4.24.-), also known as matrixins
[1] (see <PDOC00129>), are zinc-dependent enzymes. They are secreted by cells in an
inactive form (zymogen) that differs from the mature enzyme by the presence of an
N-terminal propeptide. A highly conserved octapeptide is found two residues downstream of
the C-terminal end of the propeptide. This region has been shown to be involved in
autoinhibition of matrixins [2,3]; a cysteine within the octapeptide chelates the active site
zinc ion, thus inhibiting the enzyme. This region has been called the 'cysteine switch' or
'autoinhibitor region'.

A cysteine switch has been found in the following zinc proteases: 

- MMP-1 (EC 3.4.24.7) (interstitial collagenase).

- MMP-2 (EC 3.4.24.24) (72 Kd gelatinase).

- MMP-3 (EC 3.4.24.17) (stromelysin-1).

- MMP-7 (EC 3.4.24.23) (matrilysin).

- MMP-8 (EC 3.4.24.34) (neutrophil collagenase).

- MMP-9 (EC 3.4.24.35) (92 Kd gelatinase).

- MMP-10 (EC 3.4.24.22) (stromelysin-2).

- MMP-11 (EC 3.4.24.-) (stromelysin-3).

- MMP-12 (EC 3.4.24.65) (macrophage metalloelastase).

- MMP-13 (EC 3.4.24.-) (collagenase 3).

- MMP-14 (EC 3.4.24.-) (membrane-type matrix metalloproteinase 1).

- MMP-15 (EC 3.4.24.-) (membrane-type matrix metalloproteinase 2).

- MMP-16 (EC 3.4.24.-) (membrane-type matrix metalloproteinase 3).

- Sea urchin hatching enzyme (EC 3.4.24.12) (envelysin) [4].

- Chlamydomonas reinhardtii gamete lytic enzyme (GLE) [5].

Description of pattern(s) and/or profile(s) 

Consensus pattern: P-R-C-[GN]-x-P-[DR]-[LIVSAPKQ] [C chelates the zinc ion]

[ 1] Woessner J. Jr. FASEB J. 5:2145-2154(1991).

[ 2] Sanchez-Lopez R., Nicholson R., Gesnel M.C., Matrisian L.M., Breathnach R. J. Biol. 
Chem. 263:11892-11899(1988). 

[ 3] Park A.J., Matrisian L.M., Kells A.F., Pearson R., Yuan Z., Navre M. J. Biol. Chem.
266:1584-1590(1991).

[ 4] Lepage T., Gache C. EMBO J. 9:3003-3012(1990).

[ 5] Kinoshita T., Fukuzawa H., Shimada T., Saito T., Matsuda Y. Proc. Natl. Acad. Sci.
U.S.A. 89:4693-4697(1992).

757. (Peptidase S8) Serine proteases, subtilase family, active sites 

PROSITE cross-reference(s): PS00136; SUBTILASE_ASP, PS00137; SUBTILASE_HIS, 
PS00138; SUBTILASE_SER 

Subtilases [1,2] are an extensive family of serine proteases whose catalytic activity is
provided by a charge relay system similar to that of the trypsin family of serine proteases
but which arose independently by convergent evolution. The sequences around the
residues involved in the catalytic triad (aspartic acid, serine and histidine) are completely
different from those of the analogous residues in the trypsin serine proteases and can be
used as signatures specific to that category of proteases.

The subtilase family currently includes the following proteases:

- Subtilisins (EC 3.4.21.62). These alkaline proteases from various Bacillus species have
been the target of numerous studies in the past thirty years.

- Alkaline elastase YaB from Bacillus sp. (gene ale). 
- Alkaline serine exoprotease A from Vibrio alginolyticus (gene proA).

- Aqualysin I from Thermus aquaticus (gene pstI).

- AspA from Aeromonas salmonicida.

- Bacillopeptidase F (esterase) from Bacillus subtilis (gene bpf).

- C5A peptidase from Streptococcus pyogenes (gene scpA).

- Cell envelope-located proteases PI, PII, and PIII from Lactococcus lactis.

- Extracellular serine protease from Serratia marcescens. 

- Extracellular protease from Xanthomonas campestris. 

- Intracellular serine protease (ISP) from various Bacillus. 

- Minor extracellular serine protease epr from Bacillus subtilis (gene epr).

- Minor extracellular serine protease vpr from Bacillus subtilis (gene vpr). 

- Nisin leader peptide processing protease nisP from Lactococcus lactis. 

- Serotype-specific antigen 1 from Pasteurella haemolytica (gene ssa1).

- Thermitase (EC 3.4.21.66) from Thermoactinomyces vulgaris.

- Calcium-dependent protease from Anabaena variabilis (gene prcA).

- Halolysin from halophilic bacteria sp. 172p1 (gene hly).

- Alkaline extracellular protease (AEP) from Yarrowia lipolytica (gene xpr2). 

- Alkaline proteinase from Cephalosporium acremonium (gene alp). 

- Cerevisin (EC 3.4.21.48) (vacuolar protease B) from yeast (gene PRB1).

- Cuticle-degrading protease (pr1) from Metarhizium anisopliae.

- KEX-1 protease from Kluyveromyces lactis. 

- Kexin (EC 3.4.21.61) from yeast (gene KEX-2). 

- Oryzin (EC 3.4.21.63) (alkaline proteinase) from Aspergillus (gene alp). 

- Proteinase K (EC 3.4.21.64) from Tritirachium album (gene proK). 
- Proteinase R from Tritirachium album (gene proR).

- Proteinase T from Tritirachium album (gene proT). 

- Subtilisin-like protease III from yeast (gene YSP3). 

- Thermomycolin (EC 3.4.21.65) from Malbranchea sulfurea. 

- Furin (EC 3.4.21.85), neuroendocrine convertases 1 to 3 (NEC-1 to -3) and PACE4
protease from mammals, other vertebrates, and invertebrates. These proteases are involved
in the processing of hormone precursors at sites comprised of pairs of basic amino acid
residues [3].

- Tripeptidyl-peptidase II (EC 3.4.14.10) (tripeptidyl aminopeptidase) from human.

- Prestalk-specific proteins tagB and tagC from slime mold [4]. Both proteins consist of two
domains: an N-terminal subtilase catalytic domain and a C-terminal ABC transporter domain
(see <PDOC00185>).

Description of pattern(s) and/or profile(s)

Consensus pattern: [STAIV]-x-[LIVMF]-[LIVM]-D-[DSTA]-G-[LIVMFC]-x(2,3)-[DNH] [D
is the active site residue]

Consensus pattern: H-G-[STM]-x-[VIC]-[STAGC]-[GS]-x-[LIVMA]-[STAGCLV]-[SAGM]
[H is the active site residue]
Consensus pattern: G-T-S-x-[SA]-x-P-x(2)-[STAVC]-[AG] [S is the active site residue]

Note: if a protein includes at least two of the three active site signatures, the probability of it
being a serine protease from the subtilase family is 100%.
Note: these proteins belong to family S8 in the classification of
peptidases [5,E1].
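
The two-of-three rule stated in the first note can be sketched in a few lines of Python. This is an illustration only: the regular expressions below are hand translations of the three consensus patterns of this section, the names SUBTILASE_SIGNATURES, signatures_found and is_probable_subtilase are hypothetical, and the toy sequences are invented for the example rather than taken from any database.

```python
import re

# Hand-translated regular expressions for the three active site signatures
# (assumed translations of the consensus patterns listed above).
SUBTILASE_SIGNATURES = {
    "SUBTILASE_ASP": re.compile(r"[STAIV].[LIVMF][LIVM]D[DSTA]G[LIVMFC].{2,3}[DNH]"),
    "SUBTILASE_HIS": re.compile(r"HG[STM].[VIC][STAGC][GS].[LIVMA][STAGCLV][SAGM]"),
    "SUBTILASE_SER": re.compile(r"GTS.[SA].P.{2}[STAVC][AG]"),
}

def signatures_found(sequence: str) -> list[str]:
    """Return the names of the signatures that occur in the sequence."""
    return [name for name, regex in SUBTILASE_SIGNATURES.items()
            if regex.search(sequence)]

def is_probable_subtilase(sequence: str) -> bool:
    """Apply the rule from the note: at least two of the three signatures."""
    return len(signatures_found(sequence)) >= 2

# Toy sequences built only to exercise the rule.
two_hits = "MKVAVIDTGIDSSHPPHGTHVAGTVAAKK"
one_hit = "MKVAVIDTGIDSSHPPKK"

print(signatures_found(two_hits), is_probable_subtilase(two_hits))
print(signatures_found(one_hit), is_probable_subtilase(one_hit))
```

A sequence carrying only one signature is not rejected by this rule; it simply falls outside the case for which the note makes its claim.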


[ 1] Siezen R.J., de Vos W.M., Leunissen J.A.M., Dijkstra B.W. Protein Eng. 4:719-737(1991).

[ 2] Siezen R.J. (In) Proceeding subtilisin symposium, Hamburg, (1992).
[ 3] Barr P.J. Cell 66:1-3(1991).
[ 4] Shaulsky G., Kuspa A., Loomis W.F. Genes Dev. 9:1111-1122(1995).
[ 5] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994).

758. (SSB) Single-strand binding protein family signatures 
PROSITE cross-reference(s): PS00735; SSB_1, PS00736; SSB_2

The Escherichia coli single-strand binding protein [1] (gene ssb), also known as the helix- 
destabilizing protein, is a protein of 177 amino acids. It binds tightly, as a homotetramer, to 
single-stranded DNA (ss-DNA) and plays an important role in DNA replication, 
recombination and repair. 


Closely related variants of SSB are encoded in the genome of a variety of large self- 
transmissible plasmids. SSB has also been characterized in bacteria such as Proteus mirabilis 
or Serratia marcescens. 
Eukaryotic mitochondrial proteins that bind ss-DNA and are probably involved in
mitochondrial DNA replication are structurally and evolutionarily related to prokaryotic SSB.
Proteins currently known to belong to this subfamily are listed below [2].

- Mammalian protein Mt-SSB (P16).

- Xenopus Mt-SSBs and Mt-SSBr.

- Drosophila MtSSB.

- Yeast protein RIM1.

Two signature patterns have been developed for these proteins. The first is a conserved
region in the N-terminal section of the SSB's. The second is a centrally located region which,
in Escherichia coli SSB, is known to be involved in the binding of DNA.

Description of pattern(s) and/or profile(s)

Consensus pattern: [LIVMF]-[NST]-[KRT]-[LIVM]-x-[LIVMF](2)-G-[NHRK]-[LIVM]-
[GST]-x-[DET]

Consensus pattern: T-x-W-[HY]-[RNS]-[LIVM]-x-[LIVMF]-[FY]-[NGKR]

[ 1] Meyer R.R., Laine P.S. Microbiol. Rev. 54:342-380(1990).
[ 2] Stroumbakis N.D., Li Z., Tolias P.P. Gene 143:171-177(1994).

759. KDPG and KHG aldolases active site signatures 

PROSITE cross-reference(s): PS00159; ALDOLASE_KDPG_KHG_1, PS00160;
ALDOLASE_KDPG_KHG_2

4-hydroxy-2-oxoglutarate aldolase (EC 4.1.3.16) (KHG-aldolase) catalyzes the
interconversion of 4-hydroxy-2-oxoglutarate into pyruvate and glyoxylate.
Phospho-2-dehydro-3-deoxygluconate aldolase (EC 4.1.2.14) (KDPG-aldolase) catalyzes the
interconversion of 6-phospho-2-dehydro-3-deoxy-D-gluconate into pyruvate and
glyceraldehyde 3-phosphate.

These two enzymes are structurally and functionally related [1]. They are both homotrimeric 
proteins of approximately 220 amino-acid residues. They are class I aldolases whose catalytic 
mechanism involves the formation of a Schiff-base intermediate between the substrate and
the epsilon-amino group of a lysine residue. In both enzymes, an arginine is required for
catalytic activity.

Two signature patterns were developed for these enzymes. The first one contains the active
site arginine and the second, the lysine involved in the Schiff-base formation.

Description of pattern(s) and/or profile(s)

Consensus pattern: G-[LIVM]-x(3)-E-[LIV]-T-[LF]-R [R is the active site residue]
Consensus pattern: G-x(3)-[LIVMF]-K-[LF]-F-P-[SA]-x(3)-G [K is involved in Schiff-base
formation]

[ 1] Vlahos C.J., Dekker E.E. J. Biol. Chem. 263:11683-11691(1988).

760. AP endonucleases family 1 signatures. PROSITE cross-reference(s): PS00726;
AP_NUCLEASE_F1_1, PS00727; AP_NUCLEASE_F1_2, PS00728;
AP_NUCLEASE_F1_3

DNA damaging agents such as the antitumor drugs bleomycin and neocarzinostatin or those
that generate oxygen radicals produce a variety of lesions in DNA. Amongst these is base
loss, which forms apurinic/apyrimidinic (AP) sites or strand breaks with atypical 3' termini.
DNA repair at the AP sites is initiated by specific endonuclease cleavage of the
phosphodiester backbone. Such endonucleases are also generally capable of removing
blocking groups from the 3' terminus of DNA strand breaks.

AP endonucleases can be classified into two families on the basis of sequence similarity. 
Family 1 groups the enzymes listed below [1]. 

- Escherichia coli exonuclease III (EC 3.1.11.2) (gene xthA). 

- Streptococcus pneumoniae and Bacillus subtilis exonuclease A (gene exoA).

- Mammalian AP endonuclease 1 (AP1) (EC 4.2.99.18).

- Drosophila recombination repair protein 1 (gene Rrp1).

- Arabidopsis thaliana apurinic endonuclease-redox protein (gene arp).

Except for Rrp1 and arp, these enzymes are proteins of about 300 amino-acid residues.
Rrp1 and arp both contain additional and unrelated sequences in their N-terminal section
(about 400 residues for Rrp1 and 270 for arp).

Three signature patterns were developed for this family of enzymes. The patterns are based
on the most conserved regions. The first pattern contains a glutamate which has been
shown [2], in the Escherichia coli enzyme, to bind a divalent metal ion such as magnesium or
manganese.

Consensus pattern: [APF]-D-[LIVMF](2)-x-[LIVM]-Q-E-x-K [E binds a divalent metal ion]
Consensus pattern: D-[ST]-[FY]-R-[KH]-x(7,8)-[FYW]-[ST]-[FYW](2)
Consensus pattern: N-x-G-x-R-[LIVM]-D-[LIVMFYH]-x-[LV]-x-S

[ 1] Barzilay G., Hickson I.S. BioEssays 17:713-719(1995).

[ 2] Mol C.D., Kuo C.-F., Thayer M.M., Cunningham R.P., Tainer J.A. Nature 374:381-386(1995).

761. (ER) Enhancer of rudimentary signature. PROSITE cross-reference(s): PS01290; ER

The Drosophila protein 'enhancer of rudimentary' (gene e(r)) is a small protein of 104
residues whose function is not yet clear. From an evolutionary point of view, it is highly
conserved [1] and has been found to exist in probably all multicellular eukaryotic
organisms. It has been proposed that this protein plays a role in the cell cycle.

A conserved region in the central part of the protein was selected as a signature pattern.

Consensus pattern: Y-D-I-[SA]-x-L-[FY]-x-F-[IV]-D-x(3)-D-[LIV]-S

[ 1] Gelsthorpe M., Pulumati M., McCallum C., Dang-Vu K., Tsubota S.I. Gene 186:189-195(1997).
762. (ETF alpha) Electron transfer flavoprotein alpha-subunit signature.
PROSITE cross-reference(s): PS00696; ETF_ALPHA

The electron transfer flavoprotein (ETF) [1,2] serves as a specific electron acceptor for
various mitochondrial dehydrogenases. ETF transfers electrons to the main respiratory
chain via ETF-ubiquinone oxidoreductase. ETF is a heterodimer that consists of an alpha
and a beta subunit and which binds one molecule of FAD per dimer. A similar system also
exists in some bacteria.

The alpha subunit of ETF is a protein of about 32 Kd which is structurally related to the
bacterial nitrogen fixation protein fixB which could play a role in a redox process and feed
electrons to ferredoxin.

Other related proteins are:

- Escherichia coli hypothetical protein ydiR.

- Escherichia coli hypothetical protein ygcQ.

A highly conserved region which is located in the C-terminal section was selected as a
signature pattern for these proteins.

Consensus pattern: [LI]-Y-[LIVM]-[AT]-x-G-[IV]-[SD]-G-x-[IV]-Q-H-x(2)-G-x(6)-[IV]-x-
A-[IV]-N

[ 1] Finocchiaro G., Ikeda Y., Ito M., Tanaka K. Prog. Clin. Biol. Res. 321:637-652(1990).

[ 2] Tsai M.H., Saier M.H. Jr. Res. Microbiol. 146:397-404(1995).

763. (lectin c) C-type lectin domain signature and profile 

PROSITE cross-reference(s): PS00615; C_TYPE_LECTIN_1, PS50041;
C_TYPE_LECTIN_2

A number of different families of proteins share a conserved domain which was first
characterized in some animal lectins and which seems to function as a calcium-dependent
carbohydrate-recognition domain [1,2,3]. This domain, which is known as the C-type lectin
domain (CTL) or as the carbohydrate-recognition domain (CRD), consists of about 110 to
130 residues. There are four cysteines which are perfectly conserved and involved in two
disulfide bonds. A schematic representation of the CTL domain is shown below.


xcxxxxcxxxxxxxCxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxCxxxxWxCxxxxCx 

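[The bracket lines of the original schematic, which join the cysteines linked by disulfide bonds, are not reproduced.]

'C': conserved cysteine involved in a disulfide bond.
'c': optional cysteine involved in a disulfide bond.
'*': position of the pattern.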

The categories of proteins, in which the CTL domain has been found, are listed below. 

Type-II membrane proteins where the CTL domain is located at the C-terminal extremity of 
the proteins: 

- Asialoglycoprotein receptors (ASGPR) (also known as hepatic lectins) [4]. The ASGPR's
mediate the endocytosis of plasma glycoproteins from which the terminal sialic acid residue
of their carbohydrate moieties has been removed.

- Low affinity immunoglobulin epsilon Fc receptor (lymphocyte IgE receptor), which plays 
an essential role in the regulation of IgE production and in the differentiation of B cells. 

- Kupffer cell receptor. A receptor with an affinity for galactose and fucose, that could 
be involved in endocytosis. 

- A number of proteins expressed on the surface of natural killer T-cells: NKG2, NKR-P1,
YE1/88 (Ly-49), CD69 and on B-cells: CD72, LyB-2. The CTL-domain in these proteins is
distantly related to other CTL-domains; it is unclear whether they are likely to bind
carbohydrates.

Proteins that consist of an N-terminal collagenous domain followed by a CTL-domain [5];
these proteins are sometimes called 'collectins':

- Pulmonary surfactant-associated protein A (SP-A). SP-A is a calcium- 
dependent protein that binds to surfactant phospholipids and contributes to 
lower the surface tension at the air-liquid interface in the alveoli of the 
mammalian lung. 

- Pulmonary surfactant-associated protein D (SP-D). 

- Conglutinin, a calcium-dependent lectin-like protein which binds to a yeast 
cell wall extract and to immune complexes through the complement component 
(iC3b). 

- Mannan-binding proteins (MBP) (also known as mannose-binding proteins).
MBP's bind mannose and N-acetyl-D-glucosamine in a calcium-dependent 
manner. 

- Bovine collectin-43 (CL-43). 

Selectins (or LEC-CAM) [6,7]. Selectins are cell adhesion molecules implicated in the 
interaction of leukocytes with platelets or vascular endothelium. Structurally, selectins 
consist of a long extracellular domain, followed by a transmembrane region and a short 
cytoplasmic domain. The extracellular domain is itself composed of a CTL-domain, 
followed by an EGF-like domain and a variable number of SCR/Sushi repeats. Known 
selectins are: 

- Lymph node homing receptor (also known as L-selectin, leukocyte adhesion 
molecule-1, (LAM-1), leu-8, gp90-mel, or LECAM-1) 

- Endothelial leukocyte adhesion molecule 1 (ELAM-1, E-selectin or LECAM-2). 
The ligand recognized by ELAM-1 is sialyl-Lewis x. 

- Granule membrane protein 140 (GMP-140, P-selectin, PADGEM, CD62, or LECAM- 
3). The ligand recognized by GMP-140 is Lewis x. 

Large proteoglycans that contain a CTL-domain followed by one copy of a SCR/Sushi
repeat, in their C-terminal section:

- Aggrecan (cartilage-specific proteoglycan core protein). This proteoglycan
is a major component of the extracellular matrix of cartilaginous tissues
where it has a role in the resistance to compression. 

- Brevican. 

- Neurocan. 

- Versican (large fibroblast proteoglycan), a large chondroitin sulfate 
proteoglycan that may play a role in intercellular signalling. 

In addition to the CTL and Sushi domains, these proteins also contain, in their N-terminal 
domain, an Ig-like V-type region, two or four link domains (see <PDOC00955>) and up to 
two EGF-like repeats. 

Two type-I membrane proteins: 

- Mannose receptor from macrophages. This protein mediates the endocytosis of 
glycoproteins by macrophages in several recognition and uptake processes. 
Its extracellular section consists of a fibronectin type II domain followed 

by eight tandem repeats of the CTL domain. 

- 180 Kd secretory phospholipase A2 receptor (PLA2-R). A protein whose 
structure is highly similar to that of the mannose receptor. 

- DEC-205 receptor. This protein is used by dendritic cells and thymic
epithelial cells to capture and endocytose diverse carbohydrate-binding
antigens and direct them to antigen-processing cellular compartments. The
DEC-205 extracellular section consists of a fibronectin type II domain followed
by ten tandem repeats of the CTL domain.

- Silk moth hemocytin, a humoral lectin which is involved in a self-defence
mechanism. It is composed of 2 FA58C domains (see <PDOC00988>), a CTL
domain, 2 VWFC domains (see <PDOC00928>), and a CTCK (see <PDOC00912>).

Various other proteins that uniquely consist of a CTL domain: 

- Invertebrate soluble galactose-binding lectins. A category to which belong
a humoral lectin from a flesh fly; echinoidin, a lectin from the coelomic
fluid of a sea urchin; BRA-2 and BRA-3, two lectins from the coelomic fluid
of a barnacle; a lectin from the tunicate Polyandrocarpa misakiensis; and a
newt oviduct lectin. The physiological importance of these lectins is not
yet known but they may play an important role in defense mechanisms.

- Pancreatic stone protein (PSP) (also known as pancreatic thread protein
(PTP), or reg), a protein that might act as an inhibitor of spontaneous
calcium carbonate precipitation.

- Pancreatitis associated protein (PAP), a protein that might be involved in 
the control of bacterial proliferation. 

- Tetranectin, a plasma protein that binds to plasminogen and to isolated
kringle 4. 

- Eosinophil granule major basic protein (MBP), a cytotoxic protein. 

- A galactose specific lectin from a rattlesnake. 

- Two subunits of a coagulation factor IX/factor X-binding protein (IX/X-bp),
a snake venom anticoagulant protein which binds with factors IX and X in
the presence of calcium.

- Two subunits of a phospholipase A2 inhibitor from the plasma of a snake
(PLI-A and PLI-B).

- A lipopolysaccharide-binding protein (LPS-BP) from the hemolymph of a
cockroach [8].

- Sea raven antifreeze protein (AFP) [9]. 

As a signature pattern for this domain, the C-terminal region with its three conserved 
cysteines was selected. 

Consensus pattern: C-[LIVMFYATG]-x(5,12)-[WL]-x-[DNSR]-x(2)-C-x(5,6)-
[FYWLIVSTA]-[LIVMSTA]-C [The three C's are involved in disulfide
bonds]

Note: all CTL domains have a Trp residue five positions before the second Cys
of the pattern, with the exception of tunicate lectin and cockroach LPS-BP which
have Leu.

Note: this documentation entry is linked to both a signature pattern
and a profile. As the profile is much more sensitive than the
pattern, you should use it if you have access to the necessary
software tools to do so.

[ 1] Drickamer K. J. Biol. Chem. 263:9557-9560(1988).

[ 2] Drickamer K. Prog. Nucleic Acid Res. Mol. Biol. 45:207-232(1993). 
[ 3] Drickamer K. Curr. Opin. Struct. Biol. 3:393-400(1993). 
[ 4] Spiess M. Biochemistry 29:10009-10018(1990). 

[ 5] Weis W.I., Kahn R., Fourme R., Drickamer K., Hendrickson W.A. Science 254:1608- 
10 1615(1991). 

[ 6] Siegelman M. Curr. Biol. 1:125-128(1991). 

[ 7] Lasky L.A. Science 238:964-969(1992). 

[ 8] Jomori T., Natori S. J. Biol. Chem. 266:13318-13323(1991). 

[ 9] Ng N.F.L., Hew C.-L. J. Biol. Chem. 267:16069-16075(1992). 

15 

764. (SRCR) Speract receptor repeated domain signature 
PROSITE cross-reference(s): PS00420; SPERACT_RECEPTOR

The receptor for the sea urchin egg peptide speract is a transmembrane glycoprotein of
500 amino acid residues [1]. Structurally it consists of a large extracellular domain of 450
residues, followed by a transmembrane region and a small cytoplasmic domain of 12 amino
acids. The extracellular domain contains four repeats of a 115-amino-acid domain. There are
17 positions that are perfectly conserved in the four repeats; among them are six cysteines,
six glycines, and three glutamates.

Such a domain is also found, once, in the C-terminal section of mammalian macrophage
scavenger receptor type I [2], a membrane glycoprotein implicated in the pathologic
deposition of cholesterol in arterial walls during atherogenesis.

The signature pattern that was derived spans part of the N-terminal section of the domain and
contains 8 of the 17 conserved residues.



Consensus pattern: G-x(5)-G-x(2)-E-x(6)-W-G-x(2)-C-x(3)-[FYW]-x(8)-C-x(3)-G
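
The statement that the signature contains 8 of the 17 conserved residues can be checked directly against the pattern by counting the elements that allow exactly one residue. The short Python sketch below does this; the helper name invariant_residues is an assumption introduced for illustration.

```python
from collections import Counter

# Speract receptor repeat signature as given in this entry.
SRCR_PATTERN = "G-x(5)-G-x(2)-E-x(6)-W-G-x(2)-C-x(3)-[FYW]-x(8)-C-x(3)-G"


def invariant_residues(pattern):
    """Return the pattern elements that allow exactly one residue."""
    return [e for e in pattern.split("-") if len(e) == 1 and e != "x"]


fixed = invariant_residues(SRCR_PATTERN)
print(len(fixed), dict(Counter(fixed)))
```

The eight fixed positions are four glycines, two cysteines, one glutamate and one tryptophan; the [FYW] position is variable and is not counted.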

[ 1] Dangott J.J., Jordan J.E., Bellet R.A., Garbers D.L. Proc. Natl. Acad. Sci. U.S.A.
86:2128-2132(1989).

[ 2] Freeman M., Ashkenas J., Rees D.J., Kingsley D.M., Copeland N.G., Jenkins N.A.,
Krieger M. Proc. Natl. Acad. Sci. U.S.A. 87:8810-8814(1990).

765. Bac_surface_Ag 
Bacterial surface antigen 

This entry includes the following surface antigens: D15 antigen from H. influenzae, OMA87
from P. multocida, OMP85 from N. meningitidis and N. gonorrhoeae. Number of members: 14

[1] Medline: 95255676. The sequencing of the 80-kDa D15 protective surface antigen of
Haemophilus influenzae. Flack FS, Loosmore S, Chong P, Thomas WR; Gene 1995;156:97-99.

[2] Medline: 96333354. Cloning, sequencing, expression, and protective capacity of the 
oma87 gene encoding the Pasteurella multocida 87-kilodalton outer membrane antigen. 
Ruffolo CO, Adler B; Infect Immun 1996;64:3161-3167. 

766. BRCA1 C Terminus (BRCT) domain

The BRCT domain is found predominantly in proteins involved in cell cycle checkpoint
functions responsive to DNA damage. It has been suggested that the Retinoblastoma protein
contains a divergent BRCT domain; this has not been included in this family. The BRCT
domain of XRCC1 forms a homodimer in the crystal structure (Medline: 99016060). This
suggests that pairs of BRCT domains associate as homo- or heterodimers. Number of
members: 131

[1] Medline: 96259550. BRCA1 protein products ...Functional motifs... Koonin EV, Altschul
SF, Bork P; Nature Genet 1996;13:266-268.

[2] Medline: 97153217. From BRCA1 to RAP1: a widespread BRCT module closely
associated with DNA repair. Callebaut I, Mornon JP; FEBS Lett 1997;400:25-30.

[3] Medline: 97186552. A superfamily of conserved domains in DNA damage responsive cell
cycle checkpoint proteins. Bork P, Hofmann K, Bucher P, Neuwald AF, Altschul SF, Koonin
EV; FASEB J 1997;11:68-76.

[4] Medline: 97402527. Gapped BLAST and PSI-BLAST: a new generation of protein 
database search programs. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller 
W, Lipman DJ; Nucleic Acids Res 1997;25:3389-3402. 

[5] Medline: 99016060. Structure of an XRCC1 BRCT domain: a new protein-protein
interaction module. Zhang X, Morera S, Bates PA, Whitehead PC, Coffer AI, Hainbucher K, 
Nash RA, Sternberg MJ, Lindahl T, Freemont PS; 

767. Kappa casein 

Kappa-casein is a mammalian milk protein involved in a number of important physiological 
processes. In the gut, the ingested protein is split into an insoluble peptide (para kappa- 
casein) and a soluble hydrophilic glycopeptide (caseinomacropeptide). Caseinomacropeptide 
is responsible for increased efficiency of digestion, prevention of neonate hypersensitivity to 
ingested proteins, and inhibition of gastric pathogens. Number of members: 56 

[1] Medline: 98072500. Nucleotide sequence evolution at the kappa-casein locus: evidence
for positive selection within the family Bovidae. Ward TJ, Honeycutt RL, Derr JN; Genetics 
1997;147:1863-1872. 

768. Chitinases family 18 active site 
PROSITE cross-reference(s); CHITINASE_18

Chitinases (EC 3.2.1.14) [1] are enzymes that catalyze the hydrolysis of the
beta-1,4-N-acetyl-D-glucosamine linkages in chitin polymers. From the viewpoint of sequence
similarity, chitinases belong to either family 18 or family 19 in the classification of glycosyl
hydrolases [2,E1]. Family 18 (also known as classes III or V) groups a variety
of proteins:
a) Chitinases from: 

- Prokaryotes such as Alteromonas, Bacillus, Serratia, Streptomyces, etc. 

- Plants such as Arabidopsis, cucumber, bean, tobacco, etc. 

- Fungi such as Aphanocladium, Rhizopus, Saccharomyces, etc. 
- Nematode (Brugia malayi).

- Insects (Manduca sexta).

- Baculoviruses (Autographa californica nuclear polyhedrosis virus).

b) Other proteins:

- Hevamine, a rubber tree protein with chitinase and lysozyme activities.

- Kluyveromyces lactis killer toxin alpha subunit, which acts as a chitinase.

- Flavobacterium and Streptomyces endo-beta-N-acetylglucosaminidases (EC 3.2.1.96).

- Mammalian di-N-acetylchitobiase, which is involved in the degradation of
asparagine-linked glycoproteins.

- Human cartilage glycoprotein Gp-39. 

- Jack bean concanavalin B (conB), a protein that has lost its catalytic activity. 

Site-directed mutagenesis experiments [3] and crystallographic data [4,5] have shown that a
conserved glutamate is involved in the catalytic mechanism and probably acts as a proton
donor. This glutamate is at the extremity of the best conserved region in these proteins.

Consensus pattern: [LIVMFY]-[DN]-G-[LIVMF]-[DN]-[LIVMF]-[DN]-x-E [E is the active
site residue]

[ 1] Flach J., Pilet P.-E., Jolles P. Experientia 48:701-716(1992). 
[ 2] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 3] Watanabe T., Kohori K., Miyashita K., Fujii T., Sakai H., Uchida M., Tanaka H. J. Biol.
Chem. 268:18567-18572(1993).

[ 4] Perrakis A., Tews I., Dauter Z., Oppenheim A.B., Chet I., Wilson K.S., Vorgias C.E.
Structure 2:1169-1180(1994).

[ 5] van Scheltinga A.C.T., Kalk K.H., Beintema J.J., Dijkstra B.W. Structure
2:1181-1189(1994).

769. gag_p17. gag gene protein p17 (matrix protein).

The matrix protein forms an icosahedral shell associated with the inner membrane of the
mature immunodeficiency virus. Number of members: 1598

[1] Medline: 95055757. Three-dimensional structure of the human immunodeficiency virus
type 1 matrix protein. Massiah MA, Starich MR, Paschall C, Summers MF, Christensen AM,
Sundquist WI; J Mol Biol 1994;244:198-223.

770. GDA1/CD39 family of nucleoside phosphatases signature 
PROSITE cross-reference(s); GDA1_CD39_NTPASE 

A number of nucleoside diphosphate and triphosphate hydrolases, as well as some as yet
uncharacterized proteins, have been found to belong to the same family [1,2]. This family
currently consists of:

- Yeast guanosine-diphosphatase (EC 3.6.1.42) (GDPase) (gene GDA1). GDA1 is a Golgi
integral membrane enzyme that catalyzes the hydrolysis of GDP to GMP.

- Potato apyrase (EC 3.6.1.5) (adenosine diphosphatase) (ADPase). Apyrase acts on both
ATP and ADP to produce AMP.

- Mammalian vascular ATP-diphosphohydrolase (EC 3.6.1.5) (also known as lymphoid 
cell activation antigen CD39). 

- Toxoplasma gondii nucleoside-triphosphatases (EC 3.6.1.15) (NTPase). NTPase
hydrolyses various nucleoside triphosphates to produce the corresponding nucleoside
mono- and diphosphates. This enzyme is secreted into the parasitophorous vacuole of the
invaded host cell, a specialized compartment where the parasite resides
intracellularly.

- Pea nucleoside-triphosphatases (EC 3.6.1.15) (NTPase). 

- Caenorhabditis elegans hypothetical protein C33H5.14. 

- Caenorhabditis elegans hypothetical protein R07E4.4. 

- Yeast chromosome V hypothetical protein YER005w.

The above uncharacterized proteins all seem to be membrane-bound. 

All these proteins share a number of conserved domains. The best conserved of these
domains has been selected as a signature. It is located in the central section of the
proteins.

Consensus pattern: [LIVM]-x-G-x(2)-E-G-x-[FY]-x-[FW]-[LIVA]-[TAG]-x-N-[HY]

[ 1] Handa M., Guidotti G. Biochem. Biophys. Res. Commun. 218:916-923(1996). 
[ 2] Vasconcelos E.G., Ferreira S.T., de Carvalho T.M.U., de Souza W., Kettlun A.M.,
Mancilla M., Valenzuela M.A., Verjovski-Almeida S. J. Biol. Chem.
271:22139-22145(1996).

771. GTP cyclohydrolase I signatures 

PROSITE cross-reference(s); GTP_CYCLOHYDROL_1_1; GTP_CYCLOHYDROL_1_2
GTP cyclohydrolase I (EC 3.5.4.16) catalyzes the biosynthesis of formic acid and 
dihydroneopterin triphosphate from GTP. This reaction is the first step in the biosynthesis of 
tetrahydrofolate in prokaryotes, of tetrahydrobiopterin in vertebrates, and of pteridine- 
containing pigments in insects. 

GTP cyclohydrolase I is a protein of 190 to 250 amino acid residues. The comparison
of the sequence of the enzyme from bacterial and eukaryotic sources shows that the 
structure of this enzyme has been extremely well conserved throughout evolution [1]. 

Two conserved regions were selected as signature patterns. The first contains a perfectly
conserved tetrapeptide which is part of the GTP-binding pocket [2]; the second region also
contains conserved residues involved in GTP-binding.

Consensus pattern: [DEN]-[LIVM](2)-x(2)-[KRNQ]-[DEN]-[LIVM]-x(3)-[ST]-x-C-E-H-H
Consensus pattern: [SA]-x-[RK]-x-Q-[LIVM]-Q-E-[RN]-[LI]-[TSN]

[ 1] Maier J., Witter K., Guetlich M., Ziegler I., Werner T., Ninnemann H. Biochem. 
Biophys. Res. Commun. 212:705-711(1995). 

[ 2] Nar H., Huber R., Meining W., Schmid C., Weinkauf S., Bacher A. Structure
3:459-466(1995).

772. IlvC. Acetohydroxy acid isomeroreductase 



Reference No. 2750-942P

Acetohydroxy acid isomeroreductase catalyses the conversion of acetohydroxy acids into 
dihydroxy valerates. This reaction is the second in the synthetic pathway of the essential 
branched side chain amino acids valine and isoleucine. Number of members: 29

[1] Medline: 97361822. The crystal structure of plant acetohydroxy acid isomeroreductase 
complexed with NADPH, two magnesium ions and a herbicidal transition state analog 
determined at 1.65 A resolution. Biou V, Dumas R, Cohen-Addad C, Douce R, Job D, Pebay- 
Peyroula E; EMBO J 1997;16:3405-3415. 

773. Prokaryotic membrane lipoprotein lipid attachment site 
PROSITE cross-reference(s); PROKAR_LIPOPROTEIN 

In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, 
which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The 
peptidase recognizes a conserved sequence and cuts upstream of a cysteine residue to which 
a glyceride-fatty acid lipid is attached [1]. Some of the proteins known to undergo such 
processing currently include (for recent listings see [1,2,3]): 

- Major outer membrane lipoprotein (murein-lipoproteins) (gene lpp).

- Escherichia coli lipoprotein-28 (gene nlpA). 

- Escherichia coli lipoprotein-34 (gene nlpB). 

- Escherichia coli lipoprotein nlpC. 

- Escherichia coli lipoprotein nlpD. 

- Escherichia coli osmotically inducible lipoprotein B (gene osmB). 

- Escherichia coli osmotically inducible lipoprotein E (gene osmE). 

- Escherichia coli peptidoglycan-associated lipoprotein (gene pal). 

- Escherichia coli rare lipoproteins A and B (genes rplA and rplB). 

- Escherichia coli copper homeostasis protein cutF (or nlpE). 

- Escherichia coli plasmids traT proteins. 

- Escherichia coli Col plasmids lysis proteins. 

- A number of Bacillus beta-lactamases. 

- Bacillus subtilis periplasmic oligopeptide-binding protein (gene oppA). 

- Borrelia burgdorferi outer surface proteins A and B (genes ospA and ospB). 

- Borrelia hermsii variable major protein 21 (gene vmp21) and 7 (gene vmp7). 

- Chlamydia trachomatis outer membrane protein 3 (gene omp3). 
- Fibrobacter succinogenes endoglucanase cel-3.

- Haemophilus influenzae proteins Pal and Pep.

- Klebsiella pullulanase (gene pulA).

- Klebsiella pullulanase secretion protein pulS.

- Mycoplasma hyorhinis protein p37. 

- Mycoplasma hyorhinis variant surface antigens A, B, and C (genes vlpABC). 

- Neisseria outer membrane protein H.8. 

- Pseudomonas aeruginosa lipopeptide (gene lppL).

- Pseudomonas solanacearum endoglucanase egl. 

- Rhodopseudomonas viridis reaction center cytochrome subunit (gene cytC). 

- Rickettsia 17 Kd antigen. 

- Shigella flexneri invasion plasmid proteins mxiJ and mxiM. 

- Streptococcus pneumoniae oligopeptide transport protein A (gene amiA). 

- Treponema pallidum 34 Kd antigen.

- Treponema pallidum membrane protein A (gene tmpA).

- Vibrio harveyi chitobiase (gene chb). 

- Yersinia virulence plasmid protein yscJ. 

- Halocyanin from Natronobacterium pharaonis [4], a membrane-associated copper-binding
protein. This is the first archaebacterial protein known to be modified in such a fashion.

From the precursor sequences of all these proteins, we derived a consensus pattern and a 
set of rules to identify this type of post-translational modification. 

Consensus pattern: {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C [C is the
lipid attachment site]
Additional rules:
1) The cysteine must be between positions 15 and 35 of the sequence in consideration.
2) There must be at least one Lys or one Arg in the first seven positions of the sequence.
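
The pattern and its two additional rules can be combined into a single check. The following Python sketch is one possible reading of them, not a reference implementation: the function name lipid_attachment_sites and the test sequences are hypothetical, positions are counted from 1 at the first residue of the precursor, and overlapping candidate sites are all reported.

```python
import re

# {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C written as a regex.
# The lookahead lets overlapping candidate sites be reported as well.
LIPOBOX = re.compile(r"(?=([^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C))")
PATTERN_LENGTH = 11  # 6 + 2 + 1 + 1 + 1 residues, the cysteine being the last


def lipid_attachment_sites(precursor):
    """Return 1-based positions of cysteines satisfying the pattern and both rules."""
    seq = precursor.upper()
    # Rule 2: at least one Lys or Arg among the first seven residues.
    if not re.search(r"[KR]", seq[:7]):
        return []
    sites = []
    for m in LIPOBOX.finditer(seq):
        cys_position = m.start(1) + PATTERN_LENGTH  # 0-based start -> 1-based Cys
        # Rule 1: the cysteine must lie between positions 15 and 35.
        if 15 <= cys_position <= 35:
            sites.append(cys_position)
    return sites


# Hypothetical precursor sequences made up for this illustration.
print(lipid_attachment_sites("MKKSLIAAVLLGSTLLAGCSNA"))  # Lys at 2-3, Cys at 19
print(lipid_attachment_sites("MAASLIAAVLLGSTLLAGCSNA"))  # no Lys/Arg in first seven
print(lipid_attachment_sites("MKVLLGSTLLAGCS"))          # Cys at 13, before position 15
```

Applying rule 2 before the pattern search is only a shortcut; the order of the checks does not change the result.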

[ 1] Hayashi S., Wu H.C. J. Bioenerg. Biomembr. 22:451-471(1990). 

[ 2] Klein P., Somorjai R.L., Lau P.C.K. Protein Eng. 2:15-20(1988).
[ 3] von Heijne G. Protein Eng. 2:531-534(1989).
[ 4] Mattar S., Scharf B., Kent S.B.H., Rodewald K., Oesterhelt D., Engelhard M. J. Biol.
Chem. 269:14939-14945(1994).

774. Aminoacyl-transfer RNA synthetases class-II signatures 
PROSITE cross-reference(s); AA_TRNA_LIGASE_II_1; AA_TRNA_LIGASE_II_2

Aminoacyl-tRNA synthetases (EC 6.1.1.-) [1] are a group of enzymes which activate
amino acids and transfer them to specific tRNA molecules as the first step in protein
biosynthesis. In prokaryotic organisms there are at least twenty different types of
aminoacyl-tRNA synthetases, one for each different amino acid. In eukaryotes there are
generally two aminoacyl-tRNA synthetases for each different amino acid: one cytosolic
form and one mitochondrial form. While all these enzymes have a common function, they are
widely diverse in terms of subunit size and of quaternary structure.

The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine,
phenylalanine, proline, serine, and threonine are referred to as class-II synthetases [2 to 6]
and probably have a common folding pattern in their catalytic domain for the binding of
ATP and amino acid, which is different from the Rossmann fold observed for the class I
synthetases [7].

Class-II tRNA synthetases do not share a high degree of similarity; however, at least three
conserved regions are present [2,5,8]. Signature patterns from two of these regions have been
derived.

Consensus pattern: [FYH]-R-x-[DE]-x(4,12)-[RH]-x(3)-F-x(3)-[DE]

Consensus pattern: [GSTALVF]-{DENQHRKP}-[GSTA]-[LIVMF]-[DE]-R-[LIVMF]-x-[LIVMSTAG]-[LIVMFY]

[ 1] Schimmel P. Annu. Rev. Biochem. 56:125-158(1987).
[ 2] Delarue M., Moras D. BioEssays 15:675-687(1993).
[ 3] Schimmel P. Trends Biochem. Sci. 16:1-3(1991).
[ 4] Nagel G.M., Doolittle R.F. Proc. Natl. Acad. Sci. U.S.A. 88:8121-8125(1991).
[ 5] Cusack S., Haertlein M., Leberman R. Nucleic Acids Res. 19:3489-3498(1991).
[ 6] Cusack S. Biochimie 75:1077-1081(1993).
[ 7] Cusack S., Berthet-Colominas C., Haertlein M., Nassar N., Leberman R. Nature
347:249-255(1990).
[ 8] Leveque F., Plateau P., Dessen P., Blanquet S. Nucleic Acids Res. 18:305-312(1990).

775. X. Trans-activation protein X 

This protein is found in hepadnaviruses where it is indispensable for replication. Number of 
members: 91 

776. Thymidylate synthase active site 

Thymidylate synthase (EC 2.1.1.45) [1,2] catalyzes the reductive methylation of 
dUMP to dTMP with concomitant conversion of 5,10-methylenetetrahydrofolate to 
dihydrofolate. Thymidylate synthase plays an essential role in DNA synthesis and is an 
important target for certain chemotherapeutic drugs. 

Thymidylate synthase is an enzyme of about 30 to 35 Kd in most species except in
protozoa and plants, where it exists as a bifunctional enzyme that includes a dihydrofolate
reductase domain.

A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6- 
dihydro-dUMP intermediate). The sequence around the active site of this enzyme is 
conserved from phages to vertebrates. 

Consensus pattern: R-x(2)-[LIVM]-x(3)-[FW]-[QN]-x(8,9)-[LV]-x-P-C-[HAVM]-x(3)-[QMT]-[FYW]-x-[LV]
[C is the active site residue]

[ 1] Benkovic S.J. Annu. Rev. Biochem. 49:227-251(1980). 

[ 2] Ross P., O'Gara P., Condon S. Appl. Environ. Microbiol. 56:2156-2163(1990). 

777. Glycosyl hydrolases family 31 signatures 

It has been shown [1,2,3,E1] that the following glycosyl hydrolases can be, on the 
basis of sequence similarities, classified into a single family: 

- Lysosomal alpha-glucosidase (EC 3.2.1.20) (acid maltase) is a vertebrate glycosidase
active at low pH, which hydrolyzes alpha(1->4) and alpha(1->6) linkages in glycogen,
maltose, and isomaltose.

- Alpha-glucosidase (EC 3.2.1.20) from the yeast Candida tsukubaensis.

- Alpha-glucosidase (EC 3.2.1.20) (gene malA) from the archaebacterium Sulfolobus
solfataricus.

- Intestinal sucrase-isomaltase (EC 3.2.1.48 / EC 3.2.1.10) is a vertebrate membrane-bound, 
multifunctional enzyme complex which hydrolyzes sucrose, maltose and isomaltose. The 
sucrase and isomaltase domains of the enzyme are homologous (41% of amino acid identity) 
and have most probably evolved by duplication. 

- Glucoamylase 1 (EC 3.2.1.3) (glucan 1,4-alpha-glucosidase) from various fungal species. 

- Yeast hypothetical protein YBR229c. 

- Fission yeast hypothetical protein SpAC30D11.01c. 

An aspartic acid has been implicated [4] in the catalytic activity of sucrase, 
isomaltase, and lysosomal alpha-glucosidase. The region around this active residue is highly 
conserved and can be used as a signature pattern. A second region, which contains two 
conserved cysteines, has been used as an additional signature pattern. 

Consensus pattern: [GF]-[LIVMF]-W-x-D-M-[NSA]-E [D is the active site residue]
Consensus pattern: G-[AV]-D-[LIVMTA]-C-G-[FY]-x(3)-[ST]-x(3)-L-C-x-R-W-x(2)-[LV]-[GSA]-[SA]-F-x-P-F-x-R-[DN]

[ 1] Henrissat B. Biochem. J. 280:309-316(1991). 

[ 2] Kinsella B.T., Hogan S., Larkin A., Cantwell B.A. Eur. J. Biochem. 202:657-664(1991).
[ 3] Naim H.Y., Niermann T., Kleinhans U., Hollenberg C.P., Strasser A.W.M. FEBS Lett.
294:109-112(1991).

[ 4] Hermans M.M.P., Kroos M.A., van Beeumen J., Oostra B.A., Reuser A.J.J. J. Biol. 
Chem. 266:13507-13512(1991). 

778. Urease signatures 

Urease (EC 3.5.1.5) is a nickel-binding enzyme that catalyzes the hydrolysis of urea 
to carbon dioxide and ammonia [1]. Historically, it was the first enzyme to be crystallized (in 
1926). It is mainly found in plant seeds, microorganisms and invertebrates. In plants, urease 
is a hexamer of identical chains. In bacteria [2], it consists of either two or three different 
subunits (alpha, beta and gamma). 
Urease binds two nickel ions per subunit; four histidines, an aspartate and a
carbamated lysine serve as ligands to these metals; an additional histidine is involved in the
catalytic mechanism [3].

As signatures for this enzyme, two regions were selected: one that contains two histidines
that bind one of the nickel ions, and the region of the active site histidine.

Consensus pattern T-[AY]-[GA]-[GAT]-[LIVM]-D-x-H-[LIVM]-H-x(3)-P [The two H's bind 
nickel] 

Consensus pattern [LIVM](2)-[CT]-H-[HN]-L-x(3)-[LIVM]-x(2)-D-[LIVM]-x-F-A [H is the 
active site residue] 

[ 1] Takishima K., Suga T., Mamiya G. Eur. J. Biochem. 175:151-165(1988). 

[ 2] Mobley H.L.T., Hausinger R.P. Microbiol. Rev. 53:85-108(1989).

[ 3] Jabri E., Carr M.B., Hausinger R.P., Karplus P.A. Science 268:998-1004(1995). 

779. Tyrosine specific protein phosphatases signature and profiles 

Tyrosine specific protein phosphatases (EC 3.1.3.48) (PTPase) [1 to 5] are enzymes 
that catalyze the removal of a phosphate group attached to a tyrosine residue. These enzymes 
are very important in the control of cell growth, proliferation, differentiation and 
transformation. Multiple forms of PTPase have been characterized and can be classified into 
two categories: soluble PTPases and transmembrane receptor proteins that contain PTPase 
domain(s). The currently known PTPases are listed below: 

Soluble PTPases. 

- PTPN1 (PTP-1B).

- PTPN2 (T-cell PTPase; TC-PTP).

- PTPN3 (H1) and PTPN4 (MEG), enzymes that contain an N-terminal band 4.1-like
domain (see <PDOC00566>) and could act at junctions between the membrane and
cytoskeleton.

- PTPN5 (STEP).

- PTPN6 (PTP-1C; HCP; SHP) and PTPN11 (PTP-2C; SH-PTP3; Syp), enzymes which
contain two copies of the SH2 domain at their N-terminal extremity. The Drosophila protein
corkscrew (gene csw) also belongs to this subgroup.
- PTPN7 (LC-PTP; Hematopoietic protein-tyrosine phosphatase; HePTP).

- PTPN8 (70Z-PEP).

- PTPN9 (MEG2).

- PTPN12 (PTP-G1; PTP-P19).

- Yeast PTP1.

- Yeast PTP2 which may be involved in the ubiquitin-mediated protein degradation
pathway.

- Fission yeast pyp1 and pyp2 which play a role in inhibiting the onset of mitosis.

- Fission yeast pyp3 which contributes to the dephosphorylation of cdc2.

- Yeast CDC14 which may be involved in chromosome segregation.

- Yersinia virulence plasmid PTPases (gene yopH).

- Autographa californica nuclear polyhedrosis virus 19 Kd PTPase.

Dual specificity PTPases. 
- DUSP1 (PTPN10; MAP kinase phosphatase-1; MKP-1), which dephosphorylates MAP
kinase on both Thr-183 and Tyr-185.

- DUSP2 (PAC-1), a nuclear enzyme that dephosphorylates MAP kinases ERK1 and ERK2
on both Thr and Tyr residues.

- DUSP3 (VHR).

- DUSP4 (HVH2).

- DUSP5 (HVH3).

- DUSP6 (Pyst1; MKP-3).

- DUSP7 (Pyst2; MKP-X).

- Yeast MSG5, a PTPase that dephosphorylates MAP kinase FUS3.

- Yeast YVH1.

- Vaccinia virus H1 PTPase, a dual specificity phosphatase.

Receptor PTPases. 

Structurally, all known receptor PTPases are made up of a variable length
extracellular domain, followed by a transmembrane region and a C-terminal catalytic
cytoplasmic domain. Some of the receptor PTPases contain fibronectin type III (FN-III)
repeats, immunoglobulin-like domains, MAM domains or carbonic anhydrase-like domains
in their extracellular region. The cytoplasmic region generally contains two copies of the
PTPase domain. The first seems to have enzymatic activity, while the second is inactive but
seems to affect substrate specificity of the first. In these domains, the catalytic cysteine is
generally conserved but some other, presumably important, residues are not.

In the following table, the domain structure of known receptor PTPases is shown:

                                          Extracellular          Intracellular
                                       Ig   FN-3   CAH   MAM        PTPase

Leukocyte common antigen (LCA) (CD45)   0     2    ...   ...          ...
Leukocyte antigen related (LAR)         3     8     0     0           ...
Drosophila DLAR                         3     9     0     0            2
Drosophila DPTP                         2     2     0     0            2
PTP-alpha (LRP)                         0     0     0     0            2
PTP-beta                                0    16     0     0            1
PTP-gamma                               0     1     1     0            2
PTP-delta                               0    >7     0     0            2
PTP-epsilon                             0     0     0     0            2
PTP-kappa                               1     4     0     1            2
PTP-mu                                  1     4     0     1            2
PTP-zeta                                0     1     1     0            2

PTPase domains consist of about 300 amino acids. There are two conserved cysteines;
the second one has been shown to be absolutely required for activity. Furthermore, a number
of conserved residues in its immediate vicinity have also been shown to be important.

A signature pattern was derived for PTPase domains centered on the active site
cysteine.

There are three profiles for PTPases. The first one spans the complete domain and is
not specific to any subtype. The second profile is specific to dual-specificity PTPases and the
third one to the PTP subfamily.



Consensus pattern [LIVMF]-H-C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY] [C is the active 
site residue] 
Note: the M-phase inducer phosphatases (cdc25-type phosphatase) are tyrosine-protein
phosphatases that are not structurally related to the above PTPases.

Note: this documentation entry is linked to both a signature pattern and to profiles. As
profiles are much more sensitive than the pattern, you should use them if you have access to
the necessary software tools to do so.

[ 1] Fischer E.H., Charbonneau H., Tonks N.K. Science 253:401-406(1991). 
[ 2] Charbonneau H., Tonks N.K. Annu. Rev. Cell Biol. 8:463-493(1992). 
[ 3] Trowbridge I.S. J. Biol. Chem. 266:23517-23520(1991). 
[ 4] Tonks N.K., Charbonneau H. Trends Biochem. Sci. 14:497-500(1989). 
[ 5] Hunter T. Cell 58:1013-1016(1989). 

780. Connexins signatures 

Gap junctions [1] are specialized regions of the plasma membrane which consist of
closely packed pairs of transmembrane channels, the connexons, through which small
molecules diffuse from a cell to a neighboring cell. Each connexon is composed of a
hexamer of an integral membrane protein which is often referred to as connexin. In a given
species there are a number of different, yet structurally related, tissue-specific forms of
connexins. The types of connexins which are currently known are listed below.



- Connexin 56 (Cx56).
- Connexin 50 (Cx50) (lens fiber protein MP70).
- Connexin 46 (Cx46) (alpha-3).
- Connexin 45 (Cx45) (alpha-6).
- Connexin 43 (Cx43) (alpha-1).
- Connexin 40 (Cx40) (alpha-5).
- Connexin 38 (Cx38) (alpha-2).
- Connexin 37 (Cx37) (alpha-4).
- Connexin 33 (Cx33) (alpha-7).
- Connexin 32 (Cx32) (beta-1).
- Connexin 31.1 (Cx31.1) (beta-4).
- Connexin 31 (Cx31) (beta-3).
- Connexin 30.3 (Cx30.3) (beta-5).
- Connexin 26 (Cx26) (beta-2).

Structurally the connexins consist of a short cytoplasmic N-terminal domain, followed
by four transmembrane segments that delimit two extracellular loops and one cytoplasmic
loop; the C-terminal domain is cytoplasmic and its length is variable (from 20 residues in
Cx26 to 260 residues in Cx56). The schematic representation of this structure is shown below.

[Schematic not reproduced; the only surviving label marks the cytoplasmic side.]


The sequences of the two extracellular loops are well conserved. In both loops there
are three conserved cysteines which are involved in disulfide bonds. A signature pattern
has been built from each of these two loop regions.

Consensus pattern: C-[DN]-T-x-Q-P-G-C-x(2)-V-C-[FY]-D [The three C's are involved in
disulfide bonds]
Consensus pattern: C-x(3,4)-P-C-x(3)-[LIVM]-[DEN]-C-[FY]-[LIVM]-[SA]-[KR]-P [The three
C's are involved in disulfide bonds]

[ 1] Goodenough D.A., Goliger J.A., Paul D.L. Annu. Rev. Biochem. 65:475-502(1996). 

781. Gram-positive cocci surface proteins 'anchoring' hexapeptide 

Surface proteins from Gram-positive cocci contain a conserved hexapeptide located a
few residues upstream of a hydrophobic C-terminal membrane anchor region, which is
followed by a cluster of basic amino acids [1]. This structure is represented in the following
schematic representation:

+--------------------------------------+-+--------+-+
| Variable length extracellular domain |H| Anchor |B|
+--------------------------------------+-+--------+-+

'H': conserved hexapeptide.
'B': cluster of basic residues.

It has been proposed that this hexapeptide sequence is responsible for a post-translational 
modification necessary for the proper anchoring of the proteins which bear it, to the cell wall. 
Proteins known to contain such hexapeptide are listed below: 

- Aggregation substance from Streptococcus faecalis (asa1).

- C5a peptidase from Streptococcus pyogenes (scpA). 

- C protein alpha-antigen from Streptococcus agalactiae (bca). 

- Cell surface antigen I/II (PAC) from Streptococcus mutans. 

- Dextranase from Streptococcus downei (dex). 

- Fibronectin-binding protein from Staphylococcus aureus (fnbA). 

- Fimbrial subunits from Actinomyces naeslundii and viscosus. 

- IgA binding protein from Streptococcus pyogenes (arp4). 

- IgA binding protein (B antigen) from Streptococcus agalactiae (bag). 

- IgG binding proteins from Streptococci and Staphylococcus aureus. 

- Internalin A from Listeria monocytogenes (inlA). 

- M proteins from streptococci. 

- Muramidase-released protein from Streptococcus suis (mrp). 

- Nisin leader peptide processing protease from Lactococcus lactis (nisP). 

- Protein A from Staphylococcus aureus. 

- Trypsin-resistant surface T protein from streptococci. 

- Wall-associated protein from Streptococcus mutans (wapA). 

- Wall-associated serine proteinases from Lactococcus lactis. 

Consensus pattern: L-P-x-T-G-[STGAVDE]

[ 1] Schneewind O., Jones K.F., Fischetti V.A. J. Bacteriol. 172:3310-3317(1990). 

782. Gamma-glutamyltranspeptidase signature

Gamma-glutamyltranspeptidase (EC 2.3.2.2) (GGT) [1] catalyzes the transfer of the
gamma-glutamyl moiety of glutathione to an acceptor that may be an amino acid, a peptide or
water (forming glutamate). GGT plays a key role in the gamma-glutamyl cycle, a pathway
for the synthesis and degradation of glutathione. In prokaryotes and eukaryotes, it is an
enzyme that consists of two polypeptide chains, a heavy and a light subunit, processed from a
single chain precursor. The active site of GGT is known to be located in the light subunit.

The sequences of mammalian and bacterial GGT show a number of regions of high
similarity [2]. Pseudomonas cephalosporin acylases (EC 3.5.1.-) that convert
7-beta-(4-carboxybutanamido)-cephalosporanic acid (GL-7ACA) into 7-aminocephalosporanic
acid (7ACA) and glutaric acid are evolutionarily related to GGT and also show some GGT
activity [3]. Like GGT, these GL-7ACA acylases are also composed of two subunits.

One of the conserved regions corresponds to the N-terminal extremity of the mature
light chains of these enzymes. This region has been used as a signature pattern.

Consensus pattern: T-[STA]-H-x-[ST]-[LIVMA]-x(4)-G-[SN]-x-V-[STA]-x-T-x-T-[LIVM]-[NE]-x(1,2)-[FY]-G

[ 1] Tate S.S., Meister A. Meth. Enzymol. 113:400-419(1985). 

[ 2] Suzuki H., Kumagai H., Echigo T., Tochikura T. J. Bacteriol. 171:5169-5172(1989). 
[ 3] Ishiye M., Niwa M. Biochim. Biophys. Acta 1132:233-239(1992). 

783. Ferrochelatase signature 

Ferrochelatase (EC 4.99.1.1) (protoheme ferro-lyase) [1,2] catalyzes the last step in 
heme biosynthesis: the chelation of a ferrous ion to proto-porphyrin IX, to form protoheme. 

In eukaryotes, ferrochelatase is a mitochondrial protein bound to the inner membrane, 
whose active site faces the mitochondrial matrix. The mature form of eukaryotic 
ferrochelatase is composed of about 360 amino acids. In bacteria, ferrochelatase (gene hemH) 
[3] is a protein of 310 to 380 amino acids.

The human autosomal dominant disease protoporphyria is due to the reduced activity 
of ferrochelatase. 

The signature pattern for this enzyme is based on a conserved region which contains a 
histidine residue which could be involved in binding iron. 



Consensus pattern: [LIVMF](2)-x-[ST]-x-H-[GS]-[LIVM]-P-x(4,5)-[DENQKR]-x-G-[DP]-
x(1,2)-Y

[ 1] Labbe-Bois R. J. Biol. Chem. 265:7278-7283(1990).

[ 2] Brenner D.A., Frasier F. Proc. Natl. Acad. Sci. U.S.A. 88:849-853(1991). 

[ 3] Miyamoto K., Nakahigashi K., Nishimura K., Inokuchi H. J. Mol. Biol.
219:393-398(1991).

784. Cellulose-binding domain, bacterial type 

The microbial degradation of cellulose and xylans requires several types of enzymes
such as endoglucanases (EC 3.2.1.4), cellobiohydrolases (EC 3.2.1.91) (exoglucanases), or 
xylanases (EC 3.2.1.8) [1]. 

Structurally, cellulases and xylanases generally consist of a catalytic domain joined 
to a cellulose-binding domain (CBD) by a short linker sequence rich in proline and/or 
hydroxy-amino acids. 

The CBD of a number of bacterial cellulases has been shown to consist of about 105 
amino acid residues [2]. Enzymes known to contain such a domain are: 

- Endoglucanase (gene end1) from Butyrivibrio fibrisolvens.

- Endoglucanases A (gene cenA) and B (cenB) from Cellulomonas fimi. 

- Exoglucanases A (gene cbhA) and B (cbhB) from Cellulomonas fimi. 

- Endoglucanase E-2 (gene celB) from Thermomonospora fusca. 

- Endoglucanase A (gene celA) from Microbispora bispora. 

- Endoglucanases A (gene celA), B (celB) and C (celC) from Pseudomonas fluorescens. 

- Endoglucanase A (gene celA) from Streptomyces lividans. 

- Exocellobiohydrolase (gene cex) from Cellulomonas fimi. 

- Xylanases A (gene xynA) and B (xynB) from Pseudomonas fluorescens. 

- Arabinofuranosidase C (EC 3.2.1.55) (xylanase C) (gene xynC) from Pseudomonas 
fluorescens. 

- Chitinase 63 (EC 3.2.1.14) from Streptomyces plicatus. 

- Chitinase C from Streptomyces lividans. 

The CBD domain is found either at the N-terminal or at the C-terminal extremity of these
enzymes. As shown in the following schematic representation, there are two conserved
cysteines in this CBD domain - one at each extremity of the domain - which have been shown
[3] to be involved in a disulfide bond. There are also four conserved tryptophan residues
which could be involved in the interaction of the CBD with polysaccharides.



xCxxxxWxxxxxNxxxWxxxxxxxWxxxxxxxxWNxxxxxGxxxxxxxxxxCx
                                 **************

'C': conserved cysteine involved in a disulfide bond.
'*': position of the pattern.

Consensus pattern: W-N-[STAGR]-[STDN]-[LIVM]-x(2)-[GST]-x-[GST]-x(2)-[LIVMFT]-
[GA]

[ 1] Gilkes N.R., Henrissat B., Kilburn D.G., Miller R.C. Jr., Warren R.A.J. Microbiol. Rev. 
55:303-315(1991). 

[ 2] Meinke A., Gilkes N.R., Kilburn D.G., Miller R.C. Jr., Warren R.A.J. Protein Seq. Data 
Anal. 4:349-353(1991). 

[ 3] Gilkes N.R., Claeyssens M., Aebersold R., Henrissat B., Meinke A., Morrison H.D., 
Kilburn D.G., Warren R.A.J., Miller R.C. Jr. Eur. J. Biochem. 202:367-377(1991). 

785. Amidases signature 

It has been shown [1,2,3] that several enzymes from various prokaryotic and 
eukaryotic organisms which are involved in the hydrolysis of amides (amidases) are 
evolutionarily related. These enzymes are listed below.

- Indoleacetamide hydrolase (EC 3.5.1.-), a bacterial plasmid-encoded enzyme that catalyzes 
the hydrolysis of indole-3-acetamide (IAM) into indole-3-acetate (IAA), the second step in
the biosynthesis of auxins from tryptophan. 

- Acetamidase from Emericella nidulans (gene amdS), an enzyme which allows acetamide to 
be used as a sole carbon or nitrogen source. 

- Amidase (EC 3.5.1.4) from Rhodococcus sp. N-774 and Brevibacterium sp. R312 (gene 
amdA). This enzyme hydrolyzes propionamides efficiently and, at a lower efficiency,
acetamide, acrylamide and indoleacetamide.

- Amidase (EC 3.5.1.4) from Pseudomonas chlororaphis.

- 6-aminohexanoate-cyclic-dimer hydrolase (EC 3.5.2.12) (nylon oligomers degrading
enzyme EI) (gene nylA), a bacterial plasmid-encoded enzyme which catalyzes the first step
in the degradation of 6-aminohexanoic acid cyclic dimer, a by-product of nylon manufacture 
[4]. 

- Glutamyl-tRNA(Gln) amidotransferase subunit A [5]. 

- Mammalian fatty acid amide hydrolase (gene FAAH) [6]. 

- A putative amidase from yeast (gene AMD2). 

- Mycobacterium tuberculosis putative amidases amiA2, amiB2, amiC and amiD. 

All these enzymes contain in their central section a highly conserved region rich in glycine, 
serine, and alanine residues. This region has been used as a signature pattern. 

Consensus pattern: G-[GA]-S-[GS]-[GS]-G-x-[GSA]-[GSAVY]-x-[LIVM]-[GSA]-x(6)- 
[GSAT]-x-[GA]-x-[DE]-x-[GA]-x-S-[LIVM]-R-x-P-[GSAC] 

[ 1] Mayaux J.-F., Cerbelaud E., Soubrier F., Faucher D., Petre D. J. Bacteriol.
172:6764-6773(1990).

[ 2] Hashimoto Y., Nishiyama M., Ikehata O., Horinouchi S., Beppu T. Biochim. Biophys. 
Acta 1088:225-233(1991). 

[ 3] Chang T.-H., Abelson J. Nucleic Acids Res. 18:7180-7180(1990). 

[ 4] Tsuchiya K., Fukuyama S., Kanzaki N., Kanagawa K., Negoro S., Okada H. J. Bacteriol.
171:3187-3191(1989).

[ 5] Curnow A.W., Hong K.W., Yuan R., Kim S.I., Martins O., Winkler W., Henkin T.M.,
Soll D. Proc. Natl. Acad. Sci. U.S.A. 94:11819-11826(1997).

[ 6] Cravatt B.F., Giang D.K., Mayfield S.P., Boger D.L., Lerner R.A., Gilula N.B. Nature 
384:83-87(1996). 

786. Glycosyl hydrolases family 10 active site 

The microbial degradation of cellulose and xylans requires several types of enzymes 
such as endoglucanases (EC 3.2.1.4), cellobiohydrolases (EC 3.2.1.91) (exoglucanases), or 
xylanases (EC 3.2.1.8) [1,2]. Fungi and bacteria produce a spectrum of cellulolytic enzymes
(cellulases) and xylanases which, on the basis of sequence similarities, can be classified into
families. One of these families is known as the cellulase family F [3] or as the glycosyl
hydrolases family 10 [4,E1]. The enzymes which are currently known to belong to this
family are listed below. 

- Aspergillus awamori xylanase A (xynA). 

- Bacillus sp. strain 125 xylanase (xynA). 

- Bacillus stearothermophilus xylanase. 

- Butyrivibrio fibrisolvens xylanases A (xynA) and B (xynB). 

- Caldocellum saccharolyticum bifunctional endoglucanase/exoglucanase (celB). This 
protein consists of two domains; it is the N-terminal domain, which has exoglucanase 
activity, which belongs to this family. 

- Caldocellum saccharolyticum xylanase A (xynA). 

- Caldocellum saccharolyticum ORF4. This hypothetical protein is encoded in the xynABC 
operon and is probably a xylanase. 

- Cellulomonas fimi exoglucanase/xylanase (cex). 

- Clostridium stercorarium thermostable celloxylanase. 

- Clostridium thermocellum xylanases Y (xynY) and Z (xynZ). 

- Cryptococcus albidus xylanase. 

- Penicillium chrysogenum xylanase (gene xylP). 

- Pseudomonas fluorescens xylanases A (xynA) and B (xynB). 

- Ruminococcus flavefaciens bifunctional xylanase XYLA (xynA). This protein consists of
three domains: an N-terminal xylanase catalytic domain that belongs to family 11 of glycosyl
hydrolases; a central domain composed of short repeats of Gln, Asn and Trp; and a C-terminal
xylanase catalytic domain that belongs to family 10 of glycosyl hydrolases.

- Streptomyces lividans xylanase A (xlnA). 

- Thermoanaerobacter saccharolyticum endoxylanase A (xynA). 

- Thermoascus aurantiacus xylanase. 

- Thermophilic bacterium Rt8.B4 xylanase (xynA). 

One of the conserved regions in these enzymes is centered on a conserved glutamic acid 
residue which has been shown [5], in the exoglucanase from Cellulomonas fimi, to be 
directly involved in glycosidic bond cleavage by acting as a nucleophile. This region has 
been used as a signature pattern.

Consensus pattern: [GTA]-x(2)-[LIVN]-x-[IVMF]-[ST]-E-[LIY]-[DN]-[LIVMF] [E is the
active site residue]

[ 1] Beguin P. Annu. Rev. Microbiol. 44:219-248(1990). 

[ 2] Gilkes N.R., Henrissat B., Kilburn D.G., Miller R.C. Jr., Warren R.A.J. Microbiol. Rev. 
55:303-315(1991). 

[ 3] Henrissat B., Claeyssens M., Tomme P., Lemesle L., Mornon J.-P. Gene 81:83-95(1989).

[ 4] Henrissat B. Biochem. J. 280:309-316(1991).

[ 5] Tull D., Withers S.G., Gilkes N.R., Kilburn D.G., Warren R.A.J., Aebersold R. J. Biol. 
Chem. 266:15621-15625(1991). 

787. Fructose-bisphosphate aldolase class-II signatures 

Fructose-bisphosphate aldolase (EC 4.1.2.13) [1,2] is a glycolytic enzyme that 
catalyzes the reversible aldol cleavage or condensation of fructose-1,6-bisphosphate into
dihydroxyacetone-phosphate and glyceraldehyde 3-phosphate. There are two classes of 
fructose-bisphosphate aldolases with different catalytic mechanisms. Class-II aldolases [2], 
mainly found in prokaryotes and fungi, are homodimeric enzymes which require a divalent 
metal ion - generally zinc - for their activity.

This family also includes the following proteins: 

- Escherichia coli galactitol operon protein gatY which catalyzes the transformation of 
tagatose 1,6-bisphosphate into glycerone phosphate and D-glyceraldehyde 3-phosphate.

- Escherichia coli N-acetyl galactosamine operon protein agaY which catalyzes the same 
reaction as that of gatY. 

As signature patterns for this class of enzyme, two conserved regions were selected. The first 
pattern is located in the first half of the sequence and contains two histidine residues that have 
been shown [4] to be involved in binding a zinc ion. The second is located in the C-terminal 
section and contains clustered acidic residues and glycines. 

Consensus pattern: [FYVMT]-x(1,3)-[LIVMH]-[APN]-[LIVM]-x(1,2)-[LIVM]-H-x-D-H-
[GACH] [The two H's are zinc ligands]

Consensus pattern: [LIVM]-E-x-E-[LIVM]-G-x(2)-[GM]-[GSTA]-x-E
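Since this class of enzyme is described by two signatures, one in the first half of the sequence and one in the C-terminal section, a candidate sequence can be screened for both at once. The Python sketch below is illustrative only: the two regular expressions are hand translations of the consensus patterns above, and the function name classify and the toy sequence are hypothetical.

```python
import re

# Hand translations of the two class-II aldolase consensus patterns.
ZINC_SITE = re.compile(r"[FYVMT].{1,3}[LIVMH][APN][LIVM].{1,2}[LIVM]H.DH[GACH]")
ACIDIC_SITE = re.compile(r"[LIVM]E.E[LIVM]G.{2}[GM][GSTA].E")

def classify(sequence: str) -> dict:
    """Report 1-based start positions of both signatures and whether they occur in order."""
    first = ZINC_SITE.search(sequence)
    second = ACIDIC_SITE.search(sequence)
    return {
        "zinc_site": first.start() + 1 if first else None,
        "acidic_site": second.start() + 1 if second else None,
        # The zinc-binding signature is expected upstream of the acidic one.
        "both_in_expected_order": bool(
            first and second and first.end() <= second.start()
        ),
    }

# Artificial sequence built to contain both signatures; not a real aldolase.
toy = "GSK" + "FQQLALQLHQDHG" + "Q" * 20 + "LEQELGQQGGQE" + "KK"
print(classify(toy))
```

Requiring both signatures in the expected order is a stricter screen than either pattern alone, at the cost of missing fragments that carry only one of the two regions.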

[ 1] Perham R.N. Biochem. Soc. Trans. 18:185-187(1990).

[ 2] Marsh J.J., Lebherz H.G. Trends Biochem. Sci. 17:110-113(1992). 

[ 3] von der Osten C.H., Barbas C.F. III, Wong C.-H., Sinskey A.J. Mol. Microbiol.
3:1625-1637(1989).

[ 4] Berry A., Marshall K.E. FEBS Lett. 318:11-16(1993).

788. Prolyl oligopeptidase family serine active site 

The prolyl oligopeptidase family [1,2,3] consists of a number of evolutionarily related
peptidases whose catalytic activity seems to be provided by a charge relay system similar to 
that of the trypsin family of serine proteases, but which evolved by independent convergent 
evolution. The known members of this family are listed below. 

- Prolyl endopeptidase (EC 3.4.21.26) (PE) (also called post-proline cleaving enzyme). PE is 
an enzyme that cleaves peptide bonds on the C-terminal side of prolyl residues. The sequence 
of PE has been obtained from a mammalian species (pig) and from bacteria (Flavobacterium 
meningosepticum and Aeromonas hydrophila); there is a high degree of sequence 
conservation between these sequences. 

- Escherichia coli protease II (EC 3.4.21.83) (oligopeptidase B) (gene prtB) which cleaves 
peptide bonds on the C-terminal side of lysyl and argininyl residues. 

- Dipeptidyl peptidase IV (EC 3.4.14.5) (DPP IV). DPP IV is an enzyme that removes N- 
terminal dipeptides sequentially from polypeptides having unsubstituted N-termini provided 
that the penultimate residue is proline. 

- Yeast vacuolar dipeptidyl aminopeptidase A (DPAP A) (gene: STE13) which is responsible 
for the proteolytic maturation of the alpha-factor precursor. 

- Yeast vacuolar dipeptidyl aminopeptidase B (DPAP B) (gene: DAP2). 

- Acylamino-acid-releasing enzyme (EC 3.4.19.1) (acyl-peptide hydrolase). This enzyme 
catalyzes the hydrolysis of the amino-terminal peptide bond of an N-acetylated protein to 
generate a N-acetylated amino acid and a protein with a free amino-terminus. 

A conserved serine residue has experimentally been shown (in E.coli protease II as well as in 
pig and bacterial PE) to be necessary for the catalytic mechanism. This serine, which is part 
of the catalytic triad (Ser, His, Asp), is generally located about 150 residues away from the
C-terminal extremity of these enzymes (which are all proteins that contain about 700 to 800
amino acids). 

Consensus pattern: D-x(3)-A-x(3)-[LIVMFYW]-x(14)-G-x-S-x-G-G-[LIVMFYW](2) [S is the
active site residue]
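The text above places the active-site serine roughly 150 residues from the C-terminal extremity. As a hedged illustration of how that position might be read off a match, the Python sketch below captures the serine within a hand-translated regular expression and reports its distance from the end of the sequence. The function name find_active_serine is hypothetical, and the toy sequence is artificial and far shorter than a real enzyme.

```python
import re

# Hand translation of the consensus pattern; the group captures the active-site serine.
ACTIVE_SITE = re.compile(r"D.{3}A.{3}[LIVMFYW].{14}G.(S).GG[LIVMFYW]{2}")

def find_active_serine(sequence: str):
    """Return (1-based serine position, residues between serine and C-terminus), or None."""
    m = ACTIVE_SITE.search(sequence)
    if m is None:
        return None
    position = m.start(1) + 1
    return position, len(sequence) - position

# Artificial sequence containing one instance of the signature.
toy = "M" * 10 + "DQQQAQQQL" + "Q" * 14 + "GQSQGGLV" + "K" * 20
print(find_active_serine(toy))
```

For a full-length family member, the second value would be expected to be on the order of 150, in line with the description above.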

Note: these proteins belong to families S9A/S9B/S9C in the classification of peptidases
[4,E1]. 

[ 1] Rawlings N.D., Polgar L., Barrett A.J. Biochem. J. 279:907-911(1991). 
[ 2] Barrett A.J., Rawlings N.D. Biol. Chem. Hoppe-Seyler 373:353-360(1992). 
[ 3] Polgar L., Szabo E. Biol. Chem. Hoppe-Seyler 373:361-366(1992).

[ 4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994). 

789. Formate-tetrahydrofolate ligase signatures 

Formate-tetrahydrofolate ligase (EC 6.3.4.3) (formyltetrahydrofolate synthetase) 
(FTHFS) is one of the enzymes participating in the transfer of one-carbon units, an essential 
element of various biosynthetic pathways. In many of these processes the transfers of one- 
carbon units are mediated by the coenzyme tetrahydrofolate (THF). Various reactions 
generate one-carbon derivatives of THF which can be interconverted between different 
oxidation states by FTHFS, methylenetetrahydrofolate dehydrogenase (EC 1.5.1.5) and 
methenyltetrahydrofolate cyclohydrolase (EC 3.5.4.9). 

In eukaryotes the FTHFS activity is expressed by a multifunctional enzyme, C-1-
tetrahydrofolate synthase (C1-THF synthase), which also catalyzes the dehydrogenase and
cyclohydrolase activities. Two forms of C1-THF synthases are known [1]: one is located in
the mitochondrial matrix, while the second one is cytoplasmic. In both forms the FTHFS
domain consists of about 600 amino acid residues and is located in the C-terminal section of
C1-THF synthase. In prokaryotes FTHFS activity is expressed by a monofunctional
homotetrameric enzyme of about 560 amino acid residues [2].

The sequence of FTHFS is highly conserved in all forms of the enzyme. As signature 
patterns, two regions that are almost perfectly conserved were selected. The first one is a
glycine-rich segment located in the N-terminal part of FTHFS which could be part of an
ATP-binding domain [2]. The second pattern is located in the central section of FTHFS.

Consensus pattern: G-[LIVM]-K-G-G-A-A-G-G-G-Y
Consensus pattern: V-A-T-[IV]-R-A-L-K-x-[HN]-G-G
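The phrase "almost perfectly conserved" can be made concrete by counting how many distinct peptides each pattern admits. The following Python sketch is an illustration under stated assumptions: it treats x as any of the 20 standard amino acids, handles only fixed repeat counts such as (2), and uses the hypothetical helper name count_matching_peptides.

```python
AMINO_ACIDS = 20  # assumption: x may be any of the 20 standard amino acids

def count_matching_peptides(pattern: str) -> int:
    """Count the distinct peptides that satisfy a fixed-length consensus pattern."""
    total = 1
    for element in pattern.split("-"):
        element = element.strip()
        repeat = 1
        if element.endswith(")"):
            # Fixed repeat such as [LIVM](2); ranges like x(1,2) are not handled here.
            element, _, count = element[:-1].partition("(")
            repeat = int(count)
        if element == "x":
            choices = AMINO_ACIDS
        elif element.startswith("["):
            choices = len(element) - 2   # residues listed between the brackets
        else:
            choices = 1                  # a single fixed residue
        total *= choices ** repeat
    return total

print(count_matching_peptides("G-[LIVM]-K-G-G-A-A-G-G-G-Y"))
print(count_matching_peptides("V-A-T-[IV]-R-A-L-K-x-[HN]-G-G"))
```

Under these assumptions the first signature admits 4 peptides and the second 80, compared with 20 to the power of the pattern length for an unconstrained segment of the same size.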

[ 1] Shannon K.W., Rabinowitz J.C. J. Biol. Chem. 263:7717-7725(1988). 

[ 2] Lovell C.R., Przybyla A., Ljungdahl L.G. Biochemistry 29:5687-5694(1990). 

790. Transthyretin signatures 

Transthyretin (prealbumin) [1] is a thyroid hormone-binding protein that seems to 
transport thyroxine (T4) from the bloodstream to the brain. It is a protein of about 130 amino 
acids that assembles as a homotetramer and forms an internal channel that binds thyroxine. 
Transthyretin is mainly synthesized in the brain choroid plexus. In humans, variants of the 
protein are associated with distinct forms of amyloidosis. 

The sequence of transthyretin is highly conserved in vertebrates. A number of 
uncharacterized proteins also belong to this family: 

- Escherichia coli hypothetical protein yedX. 

- Bacillus subtilis hypothetical protein yunM. 

- Caenorhabditis elegans hypothetical protein R09H10.3. 

- Caenorhabditis elegans hypothetical protein ZK697.8. 

Two regions were selected as signature patterns. The first, located in the N-terminal extremity,
starts with a lysine known to be involved in binding T4. The second pattern is located in the
C-terminal extremity.

Consensus pattern: [KH]-[IV]-L-[DN]-x(3)-G-x-P-A-x(2)-[IV]-x-[IV] [The K binds thyroxine]
Consensus pattern: Y-[TH]-[IV]-[AP]-x(2)-L-S-[PQ]-[FYW]-[GS]-[FY]-[QS]

[ 1] Schreiber G., Richardson S.J. Comp. Biochem. Physiol. 116B:137-160(1997).



791. Dihydropteroate synthase signatures

All organisms require reduced folate cofactors for the synthesis of a variety of
metabolites. Most microorganisms must synthesize folate de novo because they lack the 
active transport system of higher vertebrate cells which allows these organisms to use dietary 
folates. Enzymes that are involved in the biosynthesis of folates are therefore the target of a 
variety of antimicrobial agents such as trimethoprim or sulfonamides. 

Dihydropteroate synthase (EC 2.5.1.15) (DHPS) catalyzes the condensation of 6-
hydroxymethyl-7,8-dihydropteridine pyrophosphate with para-aminobenzoic acid to form 7,8-
dihydropteroate. This is the second step in the three-step pathway leading from 6-
hydroxymethyl-7,8-dihydropterin to 7,8-dihydrofolate. DHPS is the target of sulfonamides,
which are substrate analogs that compete with para-aminobenzoic acid.

Bacterial DHPS (gene sul or folP) [1] is a protein of about 275 to 315 amino acid 
residues which is either chromosomally encoded or found on various antibiotic resistance 
plasmids. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a 
multifunctional folate synthesis enzyme (gene fas) [2]. 

Two signature patterns for DHPS were developed: the first signature is located in the
N-terminal section of these enzymes, while the second signature is located in the central
section.

Consensus pattern: [LIVM]-x-[AG]-[LIVMF](2)-N-x-T-x-D-S-F-x-D-x-[SG]
Consensus pattern: [GE]-[SA]-x-[LIVM](2)-D-[LIVM]-G-[GP]-x(2)-[STA]-x-P

[ 1] Slock J., Stahly D.P., Han C.-Y., Six E.W., Crawford I.P. J. Bacteriol.
172:7211-7226(1990).

[ 2] Volpes F., Dyer M., Scaife J.G., Darby G., Stammers D.K., Delves C.J. Gene
112:213-218(1992).

792. Phosphatidylinositol 3- and 4-kinases signatures 

Phosphatidylinositol 3-kinase (PI3-kinase) (EC 2.7.1.137) [1] is an enzyme that 
phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The exact 
function of the three products of PI3-kinase - PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3) - is not 
yet known, although it is proposed that they function as second messengers in cell signalling. 
Currently, three forms of PI3-kinase are known:

- The mammalian enzyme which is a heterodimer of a 110 Kd catalytic chain (p110) and an
85 Kd subunit (p85) which allows it to bind to activated tyrosine protein kinases. There are at
least two different types of p110 subunits (alpha and beta).

- Yeast TOR1/DRR1 and TOR2/DRR2 [2], PI3-kinases required for cell cycle activation.
Both are proteins of about 280 Kd. 

- Yeast VPS34 [3], a PI3-kinase involved in vacuolar sorting and segregation. VPS34 is a 
protein of about 100 Kd. 

- Arabidopsis thaliana and soybean VPS34 homologs.

Phosphatidylinositol 4-kinase (PI4-kinase) (EC 2.7.1.67) [4] is an enzyme that acts on 
phosphatidylinositol (PI) in the first committed step in the production of the second 
messenger inositol-1,4,5-trisphosphate. Currently the following forms of PI4-kinases are
known: 

- Human PI4-kinase alpha. 

- Yeast PIK1, a nuclear protein of 120 Kd.

- Yeast STT4, a protein of 214 Kd. 

The PI3- and PI4-kinases share a well conserved domain at their C-terminal section; this 
domain seems to be distantly related to the catalytic domain of protein kinases [2]. Two 
signature patterns were developed from the best conserved parts of this domain. 

The following additional proteins belong to this family:

- Mammalian FKBP-rapamycin associated protein (FRAP) [5], which acts as the target for 
the cell-cycle arrest and immunosuppressive effects of the FKBP12-rapamycin complex. 

- Yeast protein ESR1 [6] which is required for cell growth, DNA repair and meiotic
recombination. 

- Yeast protein TEL1 which is involved in controlling telomere length.

- Yeast hypothetical protein YHR099w, a distantly related member of this family. 

- Fission yeast hypothetical protein SpAC22E12.16C. 

Consensus pattern: [LIVMFAC]-K-x(1,3)-[DEA]-[DE]-[LIVM]-R-Q-[DE]-x(4)-Q
Consensus pattern: [GS]-x-[AV]-x(3)-[LIVM]-x(2)-[FYH]-[LIVM](2)-x-[LIVMF]-x-D-R-H-
x(2)-N

[ 1] Hiles I.D., Otsu M., Volinia S., Fry M.J., Gout I., Dhand R., Panayotou G., Ruiz-Larrea
F., Thompson A., Totty N.F., Hsuan J.J., Courtneidge S.A., Parker P.J., Waterfield M.D. Cell 
70:419-429(1992). 

[ 2] Kunz J., Henriquez R., Schneider U., Deuter-Reinhard M., Movva N., Hall M.N. Cell 
73:585-596(1993). 

[ 3] Schu P.V., Takegawa K., Fry M.J., Stack J.H., Waterfield M.D., Emr S.D. Science 
260:88-91(1993). 

[ 4] Garcia-Bustos J.F., Marini F., Stevenson I., Frei C., Hall M.N. EMBO J.
13:2352-2361(1994).

[ 5] Brown E.J., Albers M.W., Shin T.B., Ichikawa K., Keith C.T., Lane W.S., Schreiber S.L. 
Nature 369:756-758(1994). 

[ 6] Kato R., Ogawa H. Nucleic Acids Res. 22:3104-3112(1994). 

793. FAD-dependent glycerol-3-phosphate dehydrogenase signatures 

FAD-dependent glycerol-3-phosphate dehydrogenase (EC 1.1.99.5) (GPD) catalyzes
the conversion of glycerol-3-phosphate into dihydroxyacetone phosphate. In bacteria [1] it is 
associated with the utilization of glycerol coupled to respiration. In Escherichia coli, two 
isozymes are known: one expressed under anaerobic conditions (gene glpA) and one in 
aerobic conditions (gene glpD). In eukaryotes, a mitochondrial form of GPD participates in 
the glycerol phosphate shuttle in conjunction with an NAD-dependent cytoplasmic GPD (EC 
1.1.1.8) [2,3]. 

These enzymes are proteins of about 60 to 70 Kd which contain a probable FAD- 
binding domain in their N-terminal extremity. The mammalian enzyme differs from the 
bacterial or yeast proteins by having an EF-hand calcium-binding region (See 
<PDOC00018>) in its C-terminal extremity. 

Two signature patterns were developed: one based on the first half of the FAD-
binding domain and one which corresponds to a conserved region in the central part of these
enzymes.

Consensus pattern: [IV]-G-G-G-x(2)-G-[STACV]-G-x-A-x-D-x(3)-R-G
Consensus pattern: G-G-K-x(2)-[GSTE]-Y-R-x(2)-A

[ 1] Austin D., Larson T.J. J. Bacteriol. 173:101-107(1991).

[ 2] Roennow B., Kielland-Brandt M.C. Yeast 9:1121-1130(1993). 

[ 3] Brown L.J., McDonald M.J., Lehn D.A., Moran S.M. J. Biol. Chem.
269:14363-14366(1994).

794. NOL1/NOP2/sun family signature

The following proteins seem to be evolutionarily related:

- Mammalian proliferating-cell nucleolar antigen p120 (gene NOL1) which may play a role
in the regulation of the cell cycle and the increased nucleolar activity that is associated with 
the cell proliferation. 

- Yeast nucleolar protein NOP2 (or YNA1) which could be involved in nucleolar function
during the onset of growth, and in the maintenance of nucleolar structure. 

- Yeast hypothetical protein YBL024w. 

- Bacterial protein sun (also known as fmu). 

- Escherichia coli hypothetical protein yebU. 

- Mycobacterium tuberculosis hypothetical protein MtCY21B4.24. 

- Methanococcus jannaschii hypothetical protein MJ0026. 

NOL1 is a protein of 855 residues, NOP2 consists of 618 residues, YBL024w of 684, sun is a
protein of about 430 to 450 residues and MJ0026 has 274 residues. They share a conserved
central domain which contains some highly conserved regions. One of these regions was 
selected as a signature pattern. 

Consensus pattern: [FV]-D-[KRA]-[LIVMA]-L-x-D-[AV]-P-C-[ST]-[GA]

795. moaA / nifB / pqqE family signature 

A number of proteins involved in the biosynthesis of metallo cofactors have been 
shown [1,2] to be evolutionarily related. These proteins are:

- Bacterial and archaebacterial protein moaA, which is involved in the biosynthesis of the
molybdenum cofactor (molybdopterin; MPT). 

- Arabidopsis thaliana cnx2, a protein involved in molybdopterin biosynthesis and which is 
highly similar to moaA.

- Bacillus subtilis narA, which seems to be the moaA ortholog in that bacterium.

- Bacterial protein nifB (or fixZ) which is involved in the biosynthesis of the nitrogenase
iron-molybdenum cofactor.

- Bacterial protein pqqE which is involved in the biosynthesis of the cofactor pyrrolo- 
quinoline-quinone (PQQ). 

- Pyrococcus furiosus cmo, a protein involved in the synthesis of a molybdopterin-based 
tungsten cofactor. 

- Caenorhabditis elegans hypothetical protein F49E2.1. 

All these proteins share, in their N-terminal region, a conserved domain that contains three 
cysteines. In moaA, these cysteines have been shown [1] to be important for the biological 
activity. They could be involved in the binding of an iron-sulfur cluster.

Consensus pattern: [LIV]-x(3)-C-[NP]-[LIVMF]-[QRS]-C-x-[FYM]-C [The three Cs are
putative Fe-S ligands]

[ 1] Menendez C., Igloi G., Henninger H., Brandsch R. Arch. Microbiol. 164:142-151(1995).

[ 2] Hoff T., Schnorr K.M., Meyer C., Caboche M. J. Biol. Chem. 270:6100-6107(1995).

796. Forkhead-associated (FHA) domain profile 

The forkhead-associated (FHA) domain [1,E1] is a putative nuclear signalling domain 
found in a variety of otherwise unrelated proteins. The FHA domain comprises approximately
55 to 75 amino acids and contains three highly conserved blocks separated by divergent 
spacer regions. Currently it has been found in the following proteins: 

- Four transcription factors that also contain a forkhead (FH) domain: mouse myocyte 
nuclear factor 1 (MNF1), yeast transcription factor FHL1, which probably controls
pre-mRNA processing, and yeast FKH1 and FKH2. In those proteins the FHA domain is located
N-terminal of the DNA-binding FH domain. 

- Kinase-associated protein phosphatase (KAPP) from Arabidopsis thaliana, a protein which 
specifically interacts with the receptor-type Ser/Thr-kinase RLK5. In KAPP, the FHA 
domain maps to a region that interacts with the receptor-type protein kinase RLK5 only if the 
kinase is phosphorylated on serine residues [2].

- Two protein kinases from yeast that are involved in mediating the nuclear response to DNA
damage: DUN1 and SPK1/SAD1 [3]. The latter is the only known protein containing two
copies of the FHA domain.

- Protein kinase cds1 from fission yeast contains an FHA domain and might be the ortholog of
SPK1.

- Protein kinase MEK1 from yeast, which is involved in meiotic recombination.

- Human nuclear antigen Ki67 which is expressed only in proliferating cells. 

- Yeast hypothetical protein YHR115c, which contains a RING-finger C-terminal of the
FHA domain. 

- Yeast hypothetical proteins L8083.1 and 9346.10, which contain an extensive coiled-coil 
region C-terminal of the FHA domain. 

- Caenorhabditis elegans hypothetical protein ZK632.2. 

- Caenorhabditis elegans hypothetical protein C01G6.5. 

- FraH from the prokaryote Anabaena, which contains a zinc-finger motif N-terminal of the 
FHA domain. 

- An ORF from the bacterium Streptomyces, which is on the opposite strand of the protein 
kinase pksl, overlapping the ORF of the kinase. 

[ 1] Hofmann K.O., Bucher P. Trends Biochem. Sci. 20:347-349(1995). 

[ 2] Stone J.M., Collinge M.A., Smith R.D., Horn M.A., Walker J.C. Science
266:793-795(1994).

[ 3] Navas T.A., Zhou Z., Elledge S.J. Cell 80:29-39(1995).

797. Ald_Xan_dh_C 

Aldehyde oxidase and xanthine dehydrogenase, C terminus 

[1] Romao MJ, Archer M, Moura I, Moura JJ, LeGall J, Engh R, Schneider M, Hof P, Huber 
R; Medline: 96072968 "Crystal structure of the xanthine oxidase-related aldehyde oxido- 
reductase from D. gigas." Science 1995;270:1170-1176. 

Number of members: 54 



798. Glyco_hydro_38

Glycosyl hydrolases family 38

Glycosyl hydrolases are key enzymes of carbohydrate metabolism. 

Number of members: 20 

[1] Henrissat B; Medline: 98313424; "Glycosidase families" Biochem Soc Trans 
1998;26:153-156. 

799. HECT 

HECT-domain (ubiquitin-transferase). 

The name HECT comes from Homologous to the E6-AP Carboxyl 
Terminus. 

Number of members: 43 

[1] Huibregtse JM, Scheffner M, Beaudenon S, Howley PM; Medline: 95223981; "A family 
of proteins structurally and functionally related to the E6-AP ubiquitin-protein ligase." Proc 
Natl Acad Sci U S A 1995;92:2563-2567. 

800. HRDC 
HRDC domain 

The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic 
acid binding. Mutations in the HRDC domain cause human disease. 

Number of members: 19 

[1] Morozov V, Mushegian AR, Koonin EV, Bork P; Medline: 98060076; "A putative
nucleic acid-binding domain in Bloom's and Werner's syndrome helicases" Trends Biochem 
Sci 1997;22:417-418. 

801. Integrase 

Integrase mediates integration of a DNA copy of the viral genome into the host 
chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc
binding domain. The central domain is the catalytic domain [1]. The carboxyl terminal
domain is a DNA binding domain [2].

Number of members: 581 

[1] Dyda F, Hickman AB, Jenkins TM, Engelman A, Craigie R, Davies DR; Medline:
95099322; "Crystal structure of the catalytic domain of HIV-1 integrase: similarity to other
polynucleotidyl transferases." Science 1994;266:1981-1986.

[2] Lodi PJ, Ernst JA, Kuszewski J, Hickman AB, Engelman A, Craigie R, Clore GM, 
Gronenborn AM; Medline: 95359147; "Solution structure of the DNA binding domain of
HIV-1 integrase." Biochemistry 1995;34:9826-9833.

802. lig_chan 
Ligand-gated ion channel 

This family includes the four transmembrane regions of the ionotropic glutamate 
receptors and NMDA receptors. 

Number of members: 128 

[1] Tong G, Shepherd D, Jahr CE; Medline: 95184014; "Synaptic desensitization of NMDA 
receptors by calcineurin." Science 1995;267:1510-1512. 

803. RhoGAP 
RhoGAP domain 

GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases.

Number of members: 97

[1] Musacchio A, Cantley LC, Harrison SC; Medline: 97121392; "Crystal structure of the 
breakpoint cluster region-homology domain from phosphoinositide 3-kinase p85 alpha 
subunit." Proc Natl Acad Sci U S A 1996;93:14373-14378.

[2] Barrett T, Xiao B, Dodson EJ, Dodson G, Ludbrook SB, Nurmahomed K, Gamblin SJ,
Musacchio A, Smerdon SJ, Eccleston JF; Medline: 97162209; "The structure of the GTPase-
activating domain from p50rhoGAP." Nature 1997;385:458-461.

[3] Rittinger K, Walker PA, Eccleston JF, Nurmahomed K, Owen D, Laue E, Gamblin SJ, 
Smerdon SJ; Medline: 97404320; "Crystal structure of a small G protein in complex with the 
GTPase-activating protein rhoGAP." Nature 1997;388:693-697.

[4] Boguski MS, McCormick F; Medline: 94081948; "Proteins regulating Ras and its
relatives." Nature 1993;366:643-654. 

804. vwd 

von Willebrand factor type D domain 

[1] Bork P; Medline: 93327926; "The modular architecture of a new family of growth 
regulators related to connective tissue growth factor." FEBS Lett 1993;327:125-130.

Number of members: 92 

805. zf-C4_Topoisom 
Topoisomerase DNA binding C4 zinc finger 

[1] Tse-Dinh YC, Beran-Steed RK; Medline: 89034032; "Escherichia coli DNA
topoisomerase I is a zinc metalloprotein with three repetitive zinc-binding domains." J Biol
Chem 1988;263:15857-15859.

[2] Ahumada A, Tse-Dinh YC; Medline: 99011409; "The Zn(II) binding motifs of E. coli 
DNA topoisomerase I is part of a high-affinity DNA binding domain." Biochem Biophys Res 
Commun 1998;251:509-514. 

Number of members: 51

806. AIRC 
AIR carboxylase

Members of this family catalyse the decarboxylation of
1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyses the
sixth step of de novo purine biosynthesis. Some members of this family contain two
copies of this domain.

Number of members: 35

807. Bromodomain signature and profile 

PROSITE cross-reference(s): PS00633; BROMODOMAIN_1, PS50014;
BROMODOMAIN_2

The bromodomain [1,2,3] is a conserved region of about 70 amino acids found in the 
following proteins: 

- Higher eukaryotes transcription initiation factor TFIID 250 Kd subunit (TBP-associated
factor p250) (gene CCG1). P250 is associated with the TFIID TATA-box binding protein and
seems essential for progression of the G1 phase of the cell cycle.

- Human RING3, a protein of unknown function encoded in the MHC class II locus. 

- Mammalian CREB-binding protein (CBP), which mediates cAMP-gene regulation by 
binding specifically to phosphorylated CREB protein. 

- Drosophila female sterile homeotic protein (gene fsh), required maternally for proper 
expression of other homeotic genes involved in pattern formation, such as Ubx. 

- Drosophila brahma protein (gene brm), a protein required for the activation of multiple 
homeotic genes. 

- Mammalian homologs of brahma. In human, three brahma-like proteins are known:
SNF2a(hBRM), SNF2b, and BRG1.

- Human BS69, a protein that binds to adenovirus E1A and inhibits E1A transactivation.

- Human peregrin (or Br140).

- Yeast BDF1 [3], a transcription factor involved in the expression of a broad class of genes
including snRNAs.

- Yeast GCN5, a general transcriptional activator operating in concert with certain other
DNA-binding transcriptional activators, such as GCN4, HAP2/3/4 or ADA2.

- Yeast NPS1/STH1, involved in G(2) phase control in mitosis.

- Yeast SNF2/SWI2, which is part of a complex with the SNF5, SNF6, SWI3 and 
ADR6/SWI1 proteins. This SWI-complex is involved in transcriptional activation. 

- Yeast SPT7, a transcriptional activator of Ty elements and possibly other genes.

- Caenorhabditis elegans protein cbp-1.

- Yeast hypothetical protein YGR056w.

- Yeast hypothetical protein YKR008w.

- Yeast hypothetical protein L9638.1. 

Some proteins contain a region which, while similar to some extent to a classical 
bromodomain, diverges from it by either lacking part of the domain or because of an 
insertion. These proteins are: 

- Mammalian protein HRX (also known as All-1 or MLL), a protein involved in
translocations leading to acute leukemias and which possibly acts as a transcriptional
regulatory factor. HRX contains a region similar to the C-terminal half of the bromodomain.

- Caenorhabditis elegans hypothetical protein ZK783.4. The bromodomain of this protein has 
a 23 amino-acid insertion. 

- Yeast protein YTA7. This protein contains a region with significant similarity to the
C-terminal half of the bromodomain. As it is a member of the AAA family (see
<PDOC00572>) it is also in a functionally different context.

The above proteins generally contain a single bromodomain, but some of them contain two
copies; this is the case for BDF1, CCG1, fsh, RING3, YKR008w and L9638.1.

The exact function of this domain is not yet known but it is thought to be involved in protein- 
protein interactions and it may be important for the assembly or activity of multicomponent 
complexes involved in transcriptional activation. 

The consensus pattern that has been developed spans a major part of the bromodomain; a 
more sensitive detection is available through the use of a profile which spans the whole 
domain. 

Consensus pattern: [STANVF]-x(2)-F-x(4)-[DNS]-x(5,7)-[DENQTF]-Y-[HFY]-x(2)-
[LIVMFY]-x(3)-[LIVM]-x(4)-[LIVM]-x(6,8)-Y-x(12,13)-[LIVM]-
x(2)-N-[SACF]-x(2)-[FY]
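
By way of illustration only, a consensus pattern written in this notation can be translated mechanically into a regular expression and used to scan a protein sequence. The sketch below is a minimal example under stated assumptions: the function name prosite_to_regex and the constant names are hypothetical, and only the notation elements that appear in this document are handled (single residues, bracketed alternatives, x for any residue, and repeat counts in parentheses).

```python
import re

# One pattern element: a residue spec followed by an optional repeat count,
# e.g. "F", "[DNS]", "x(2)", "x(5,7)" or "[LIVM](2)".
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")

def prosite_to_regex(pattern: str) -> str:
    """Translate a hyphen-separated consensus pattern into a regex string."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unrecognised pattern element: {element!r}")
        residue, low, high = m.groups()
        if residue == "x":
            residue = "."                      # x stands for any residue
        if low is None:
            repeat = ""                        # no count: exactly one residue
        elif high is None:
            repeat = "{" + low + "}"           # x(4)   -> .{4}
        else:
            repeat = "{" + low + "," + high + "}"  # x(5,7) -> .{5,7}
        parts.append(residue + repeat)
    return "".join(parts)

# The bromodomain consensus pattern as given in the text above.
BROMODOMAIN_1 = (
    "[STANVF]-x(2)-F-x(4)-[DNS]-x(5,7)-[DENQTF]-Y-[HFY]-x(2)-"
    "[LIVMFY]-x(3)-[LIVM]-x(4)-[LIVM]-x(6,8)-Y-x(12,13)-[LIVM]-"
    "x(2)-N-[SACF]-x(2)-[FY]"
)
BROMODOMAIN_REGEX = re.compile(prosite_to_regex(BROMODOMAIN_1))
```

A call such as BROMODOMAIN_REGEX.search(sequence) then returns a match object when the one-letter-code sequence contains the signature, and None otherwise. As the text notes, a profile covering the whole domain is the more sensitive detection method; a pattern scan of this kind is only a first filter.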

808. (CH) Actinin-type actin-binding domain signatures

PROSITE cross-reference(s): PS00019; ACTININ_1, PS00020; ACTININ_2

Alpha-actinin is an F-actin cross-linking protein which is thought to anchor actin to a variety of
intracellular structures [1]. The actin-binding domain of alpha-actinin seems to reside in the
first 250 residues of the protein. A similar actin-binding domain has been found in the
N-terminal region of many different actin-binding proteins [2,3]:

- In the beta chain of spectrin (or fodrin). 

- In dystrophin, the protein defective in Duchenne muscular dystrophy (DMD) and which 
may play a role in anchoring the cytoskeleton to the plasma membrane. 

- In the slime mold gelation factor (or ABP-120). 

- In actin-binding protein ABP-280 (or filamin), a protein that links actin filaments to
membrane glycoproteins.

- In fimbrin (or plastin), an actin-bundling protein. Fimbrin differs from the above proteins in 
that it contains two tandem copies of the actin-binding domain and that these copies are 
located in the C-terminal part of the protein. 

Two conserved regions were selected as signature patterns for this type of domain. The first of
these regions is located at the beginning of the domain, while the second one is located in the
central section and has been shown to be essential for the binding of actin.

Consensus pattern: [EQ]-x(2)-[ATV]-[FY]-x(2)-W-x-N

Consensus pattern: [LIVM]-x-[SGN]-[LIVM]-[DAGHE]-[SAG]-x-[DNEAG]-[LIVM]-x-
[DEAG]-x(4)-[LIVM]-x-[LM]-[SAG]-[LIVM]-[LIVMT]-W-x-[LIVM](2)

809. (COX1) Heme-copper oxidase subunit I, copper B binding region signature
PROSITE cross-reference(s): PS00077; COX1

Heme-copper respiratory oxidases [1] are oligomeric integral membrane protein 
complexes that catalyze the terminal step in the respiratory chain: they 
transfer electrons from cytochrome c or a quinol to oxygen. Some terminal 
oxidases generate a transmembrane proton gradient across the plasma membrane 
(prokaryotes) or the mitochondrial inner membrane (eukaryotes). The enzyme 
complex consists of 3-4 subunits (prokaryotes) up to 13 polypeptides (mammals) 
of which only the catalytic subunit (equivalent to mammalian subunit 1 (CO I)) 
is found in all heme-copper respiratory oxidases. The presence of a bimetallic 
center (formed by a high-spin heme and copper B) as well as a low-spin heme, 
both ligated to six conserved histidine residues near the outer side of four 
transmembrane spans within CO I is common to all family members [2-4]. 

In contrast to eukaryotes, the respiratory chain of prokaryotes is branched to
multiple terminal oxidases. The enzyme complexes vary in heme and copper
composition, substrate type and substrate affinity. The different respiratory
oxidases allow the cells to customize their respiratory systems according to a
variety of environmental growth conditions [1].

Recently, a component of an anaerobic respiratory chain has also been found to
contain the copper B binding signature of this family: nitric oxide reductase
(NOR) exists in denitrifying species of Archaea and Eubacteria.

Enzymes that belong to this family are: 

- Mitochondrial-type cytochrome c oxidase (EC 1.9.3.1) which uses cytochrome
c as electron donor. The electrons are transferred via copper A (Cu(A)) and
heme a to the bimetallic center of CO I that is formed by a penta-coordinated
heme a and copper B (Cu(B)). Subunit 1 contains 12 transmembrane regions.
Cu(B) is said to be ligated to three of the conserved histidine residues
within the transmembrane segments 6 and 7.

- Quinol oxidase from prokaryotes that transfers electrons from a quinol to 
the binuclear center of polypeptide I. This category of enzymes includes 
Escherichia coli cytochrome O terminal oxidase complex which is a component 
of the aerobic respiratory chain that predominates when cells are grown at 
high aeration. 

- FixN, the catalytic subunit of a cytochrome c oxidase expressed in 
nitrogen-fixing bacteroids living in root nodules. The high affinity for 
oxygen allows oxidative phosphorylation under low oxygen concentrations. A 
similar enzyme has been found in other purple bacteria. 

- Nitric oxide reductase (EC 1.7.99.7) from Pseudomonas stutzeri. NOR reduces
nitrate to dinitrogen. It is a heterodimer of norC and the catalytic
subunit norB. The latter contains the 6 invariant histidine residues and 12
transmembrane segments [5].

As a signature pattern the copper-binding region was used. 

Consensus pattern: [YWG]-[LIVFYWTA](2)-[VGS]-H-[LNP]-x-V-x(44,47)-H-H
[The three H's are copper B ligands]

Note: cytochrome bd complexes do not belong to this family.
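
Because the pattern annotates its three histidines as copper B ligands, a scan can report where those residues fall in a candidate sequence. The following is a minimal sketch rather than a prescribed method: the regular expression is a hand translation of the consensus pattern above, and the names COX1_CUB and copper_b_ligands are hypothetical.

```python
import re

# Hand translation of the copper B binding pattern; the named groups mark
# the three histidines that the pattern annotates as copper B ligands.
COX1_CUB = re.compile(
    r"[YWG][LIVFYWTA]{2}[VGS](?P<his1>H)[LNP].V.{44,47}(?P<his2>H)(?P<his3>H)"
)

def copper_b_ligands(sequence: str):
    """Return the 1-based positions of the three ligand histidines.

    Only the first occurrence of the signature is reported; None is
    returned when the sequence does not contain the signature.
    """
    m = COX1_CUB.search(sequence)
    if m is None:
        return None
    return tuple(m.start(name) + 1 for name in ("his1", "his2", "his3"))
```

The positions are counted from 1, following the usual convention for protein sequences, so they can be compared directly with residue numbers in a sequence record.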

[3]

Capaldi R.A., Malatesta P., Darley-Usmar V.M.
Biochim. Biophys. Acta 726:135-148(1983).

[4]

Holm L., Saraste M., Wikstrom M.
EMBO J. 6:2819-2823(1987).

[5]

Saraste M., Castresana J.
FEBS Lett. 341:1-4(1994).

810. (dehydrog_molyb) Eukaryotic molybdopterin oxidoreductases signature 
PROSITE cross-reference(s): PS00559; MOLYBDOPTERIN_EUK 

A number of different eukaryotic oxidoreductases that require and bind a 
molybdopterin cofactor have been shown [1] to share a few regions of sequence 
similarity. These enzymes are: 

- Xanthine dehydrogenase (EC 1.1.1.204), which catalyzes the oxidation of
xanthine to uric acid with the concomitant reduction of NAD. Structurally,
this enzyme of about 1300 amino acids consists of at least three distinct
domains: an N-terminal 2Fe-2S ferredoxin-like iron-sulfur binding domain
(see <PDOC00175>), a central FAD/NAD-binding domain and a C-terminal
Mo-pterin domain.

- Aldehyde oxidase (EC 1.2.3.1), which catalyzes the oxidation of aldehydes into
acids. Aldehyde oxidase is highly similar to xanthine dehydrogenase in its
sequence and domain structure.

- Nitrate reductase (EC 1.6.6.1), which catalyzes the reduction of nitrate
to nitrite. Structurally, this enzyme of about 900 amino acids consists of
an N-terminal Mo-pterin domain, a central cytochrome b5-type heme-binding
domain (see <PDOC00170>) and a C-terminal FAD/NAD-binding cytochrome
reductase domain.

- Sulfite oxidase (EC 1.8.3.1), which catalyzes the oxidation of sulfite to 
sulfate. Structurally, this enzyme of about 460 amino acids consists of an 
N-terminal cytochrome b5-binding domain followed by a Mo-pterin domain. 

There are a few conserved regions in the sequence of the molybdopterin-binding
domain of these enzymes. The pattern used to detect these proteins is based
on one of them. It contains a cysteine residue which could be involved in
binding the molybdopterin cofactor.

Consensus pattern: [GA]-x(3)-[KRNQHT]-x(11,14)-[LIVMFYWS]-x(8)-[LIVMF]-x-C-
x(2)-[DEN]-R-x(2)-[DE]

[1] 

Wootton J.C., Nicolson R.E., Cock J.M., Walters D.E., Burke J.F., Doyle 
W.A., Bray R.C. 

Biochim. Biophys. Acta 1057:157-185(1991). 

811. (DNA_ligase) ATP-dependent DNA ligase signatures

PROSITE cross-reference(s): PS00697; DNA_LIGASE_A1, PS00333; DNA_LIGASE_A2

DNA ligase (polydeoxyribonucleotide synthase) is the enzyme that joins two DNA 
fragments by catalyzing the formation of an internucleotide ester bond between 
phosphate and deoxyribose. It is active during DNA replication, DNA repair and 
DNA recombination. There are two forms of DNA ligase: one requires ATP 
(EC 6.5.1.1), the other NAD (EC 6.5.1.2). 

Eukaryotic, archaebacterial, virus and phage DNA ligases are ATP-dependent. 
During the first step of the joining reaction, the ligase interacts with ATP 
to form a covalent enzyme-adenylate intermediate. A conserved lysine residue 
is the site of adenylation [1,2]. 

Apart from the active site region, the only conserved region common to all 
ATP-dependent DNA ligases is found [3] in the C-terminal section and contains 
a conserved glutamate as well as four positions with conserved basic residues. 

Signature patterns were developed for both conserved regions. 

Consensus pattern: [EDQH]-x-K-x-[DN]-G-x-R-[GACIVM]
[K is the active site residue]

Consensus pattern: E-G-[LIVMA]-[LIVM](2)-[KR]-x(5,8)-[YW]-[QNEK]-x(2,6)-
[KRH]-x(3,5)-K-[LIVMFY]-K

Sequences known to belong to this class detected by the pattern: ALL, except
for archaebacterial DNA ligases.

[1] 

Tomkinson A.E., Totty N.F., Ginsburg M., Lindahl T. 
Proc. Natl. Acad. Sci. U.S.A. 88:400-404(1991). 
[2] 

Lindahl T., Barnes D.E. 

Annu. Rev. Biochem. 61:251-281(1992). 

[3] 

Kletzin A. 

Nucleic Acids Res. 20:5389-5396(1992). 

812. (FAD_Gly3P_dh) FAD-dependent glycerol-3-phosphate dehydrogenase signatures 
PROSITE cross-reference(s): PS00977; FAD_G3PDH_1, PS00978; FAD_G3PDH_2 

FAD-dependent glycerol-3-phosphate dehydrogenase (EC 1.1.99.5) (GPD) catalyzes
the conversion of glycerol-3-phosphate into dihydroxyacetone phosphate. In 
bacteria [1] it is associated with the utilization of glycerol coupled to 
respiration. In Escherichia coli, two isozymes are known: one expressed under 
anaerobic conditions (gene glpA) and one in aerobic conditions (gene glpD). In 
eukaryotes, a mitochondrial form of GPD participates in the glycerol phosphate 
shuttle in conjunction with an NAD-dependent cytoplasmic GPD (EC 1.1.1.8) [2, 
3]. 

These enzymes are proteins of about 60 to 70 Kd which contain a probable 
FAD-binding domain in their N-terminal extremity. The mammalian enzyme differs 
from the bacterial or yeast proteins by having an EF-hand calcium-binding 
region (see <PDOC00018>) in its C-terminal extremity.

Two signature patterns were developed: one based on the first half of the
FAD-binding domain and one which corresponds to a conserved region in the
central part of these enzymes.

Consensus pattern: [IV]-G-G-G-x(2)-G-[STACV]-G-x-A-x-D-x(3)-R-G

Consensus pattern: G-G-K-x(2)-[GSTE]-Y-R-x(2)-A
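
Where a family is described by two signatures, a sequence can be checked for both at once. The sketch below is illustrative only: the regular expressions are hand translations of the two patterns above, the function names are hypothetical, and the requirement that the first signature precede the second is an assumption drawn from the stated locations (N-terminal FAD-binding domain, then the central region), not a rule given in the source.

```python
import re

# Hand translations of the two consensus patterns given above.
SIGNATURES = {
    "FAD_G3PDH_1": re.compile(r"[IV]GGG.{2}G[STACV]G.A.D.{3}RG"),
    "FAD_G3PDH_2": re.compile(r"GGK.{2}[GSTE]YR.{2}A"),
}

def signature_starts(sequence: str) -> dict:
    """Map each signature name to the 1-based start of its first hit, or None."""
    starts = {}
    for name, regex in SIGNATURES.items():
        m = regex.search(sequence)
        starts[name] = m.start() + 1 if m else None
    return starts

def has_both_signatures_in_order(sequence: str) -> bool:
    """True when both signatures occur and the first precedes the second.

    The ordering test is an assumption made for this sketch.
    """
    starts = signature_starts(sequence)
    first, second = starts["FAD_G3PDH_1"], starts["FAD_G3PDH_2"]
    return first is not None and second is not None and first < second
```

Requiring both hits is stricter than accepting either one, which may reduce chance matches from the shorter second pattern at the cost of missing divergent family members.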
[1] 

Austin D., Larson T.J. 

J. Bacteriol. 173:101-107(1991). 

[2] 

Roennow B., Kielland-Brandt M.C. 

Yeast 9:1121-1130(1993). 

[3] 

Brown L.J., McDonald M.J., Lehn D.A., Moran S.M. 
J. Biol. Chem. 269:14363-14366(1994). 

813. (Fapy_DNA_glyco) Formamidopyrimidine-DNA glycosylase signature 
PROSITE cross-reference(s): PS01242; FPG 

Formamidopyrimidine-DNA glycosylase (EC 3.2.2.23) [1] (Fapy-DNA glycosylase)
(gene fpg) is a bacterial enzyme involved in DNA repair which excises
oxidized purine bases to release
2,6-diamino-4-hydroxy-5N-methylformamidopyrimidine (Fapy) and
7,8-dihydro-8-oxoguanine (8-OxoG) residues. In addition to its glycosylase
activity, FPG can also nick DNA at apurinic/apyrimidinic sites (AP sites).
FPG is a monomeric protein of about 32 Kd which binds and requires zinc for
its activity.

The binding site for zinc seems to be located in the C-terminal part of the
enzyme where four conserved and essential [2] cysteines are located. A
signature pattern was developed based on this region.

Consensus pattern: C-x(2,4)-C-x-[GTAQ]-x-[IV]-x(7)-R-[GSTAN]-[STA]-x-[FYI]-C-x(2)-C-Q
[The four C's are putative zinc ligands]

[1]

Duwat P., de Oliveira R., Ehrlich S.D., Boiteux S. 

Microbiology 141:411-417(1995). 

[2] 

O'Connor T.E., Graves R.J., Demurcia G., Castaing B., Laval J. 
J. Biol. Chem. 268:9063-9070(1993).

814. (G_glu_transpept) Gamma-glutamyltranspeptidase signature 
PROSITE cross-reference(s): PS00462; G_GLU_TRANSPEPTIDASE 

Gamma-glutamyltranspeptidase (EC 2.3.2.2) (GGT) [1] catalyzes the transfer of
the gamma-glutamyl moiety of glutathione to an acceptor that may be an amino
acid, a peptide or water (forming glutamate). GGT plays a key role in the
gamma-glutamyl cycle, a pathway for the synthesis and degradation of
glutathione. In prokaryotes and eukaryotes, it is an enzyme that consists of
two polypeptide chains, a heavy and a light subunit, processed from a single
chain precursor. The active site of GGT is known to be located in the light
subunit.

The sequences of mammalian and bacterial GGT show a number of regions of
high similarity [2]. Pseudomonas cephalosporin acylases (EC 3.5.1.-) that
convert 7-beta-(4-carboxybutanamido)-cephalosporanic acid (GL-7ACA) into
7-aminocephalosporanic acid (7ACA) and glutaric acid are evolutionarily related
to GGT and also show some GGT activity [3]. Like GGT, these GL-7ACA acylases
are also composed of two subunits.

One of the conserved regions corresponds to the N-terminal extremity of the
mature light chains of these enzymes. This region was used as a signature
pattern.

Consensus pattern: T-[STA]-H-x-[ST]-[LIVMA]-x(4)-G-[SN]-x-V-[STA]-x-T-x-
[LIVM]-[NE]-x(1,2)-[FY]-G

[1] 

Tate S.S., Meister A. 

Meth. Enzymol. 113:400-419(1985). 

[2] 

Suzuki H., Kumagai H., Echigo T., Tochikura T. 

J. Bacteriol. 171:5169-5172(1989). 

[3] 

Ishiye M., Niwa M. 

Biochim. Biophys. Acta 1132:233-239(1992). 
815. G-protein gamma subunit profile 

PROSITE cross-reference(s): PS50058; G_PROTEIN_GAMMA 

Guanine nucleotide-binding proteins (G proteins) [1] act as intermediaries in 
the transduction of signals generated by transmembrane receptors. G proteins 
consist of three subunits (alpha, beta, and gamma). The alpha subunit binds to 
and hydrolyzes GTP; the functions of the beta and gamma subunits are less 
clear but they seem to be required for the replacement of GDP by GTP as well 
as for membrane anchoring and receptor recognition. 

The gamma subunits are small proteins (from 70 to 110 residues) that are
bound to the membrane via an isoprenyl group (either a farnesyl or a
geranylgeranyl) covalently linked to their C-terminus. In mammals there are at
least 12 different isoforms of gamma subunits.

The Caenorhabditis elegans protein egl-10, which is a regulator of G-protein 
signalling, contains a G-protein gamma-like domain. 



A profile was developed that spans the complete length of the gamma subunit.

[1]

Pennington S.R. 

Protein Prof. 2:16-315(1995). 

816. GNS1/SUR4 family signature 
PROSITE cross-reference(s): PS01188; GNS1_SUR4 

The following group of eukaryotic integral membrane proteins, whose exact
function has not yet been clearly established, are evolutionarily related [1]:

- Yeast GNS1 [2], a protein involved in synthesis of 1,3-beta-glucan.

- Yeast SUR4 (or APA1, SRE1) [3], a protein that could act in a
glucose-signaling pathway that controls the expression of several genes that
are transcriptionally regulated by glucose.

- Yeast hypothetical protein YJL196c. 

- Caenorhabditis elegans hypothetical protein C40H1.4. 

- Caenorhabditis elegans hypothetical protein D2024.3. 

The proteins have from 290 to 435 amino acid residues. Structurally, they seem
to be formed of three sections: an N-terminal region with two transmembrane
domains, a central hydrophilic loop and a C-terminal region that contains from
one to three transmembrane domains. A conserved region that contains three
histidines was selected as a signature pattern. This region is located in the
hydrophilic loop.

Consensus pattern: L-x-F-L-H-x-Y-H-H
[1] 

Bairoch A. 

Unpublished observations (1996). 
[2]

El-Sherbeini M., Clemas J.A.
J. Bacteriol. 177:3227-3234(1995).

[3] 

Garcia-Arranz M., Maldonado A.M., Mazon M.J., Portillo F. 
J. Biol. Chem. 269:18076-18082(1994). 

817. Immunoglobulins and major histocompatibility complex proteins signature 
PROSITE cross-reference(s): PS00290; IG_MHC 

The basic structure of immunoglobulin (Ig) [1] molecules is a tetramer of two 
light chains and two heavy chains linked by disulfide bonds. There are two 
types of light chains: kappa and lambda, each composed of a constant domain 
(CL) and a variable domain (VL). There are five types of heavy chains: alpha, 
delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and 
three (in alpha, delta and gamma) or four (in epsilon and mu) constant
domains (CH1 to CH4).

The major histocompatibility complex (MHC) molecules are made of two chains. 
In class I [2] the alpha chain is composed of three extracellular domains, a 
transmembrane region and a cytoplasmic tail. The beta chain (beta-2- 
microglobulin) is composed of a single extracellular domain. In class II [3], 
both the alpha and the beta chains are composed of two extracellular domains, 
a transmembrane region and a cytoplasmic tail. 

It is known [4,5] that the Ig constant chain domains and a single 
extracellular domain in each type of MHC chains are related. These 
homologous domains are approximately one hundred amino acids long and 
include a conserved intradomain disulfide bond. A small pattern around the
C-terminal cysteine involved in this disulfide bond can be used to detect
this category of Ig-related proteins.

Consensus pattern: [FY]-x-C-x-[VA]-x-H

Sequences known to belong to this class detected by the pattern:

- Ig heavy chains type Alpha C region: All, in CH2 and CH3.
- Ig heavy chains type Delta C region: All, in CH3.
- Ig heavy chains type Epsilon C region: All, in CH1, CH3 and CH4.
- Ig heavy chains type Gamma C region: All, in CH3 and also CH1 in some cases.
- Ig heavy chains type Mu C region: All, in CH2, CH3 and CH4.
- Ig light chains type Kappa C region: In all CL except rabbit and Xenopus.
- Ig light chains type Lambda C region: In all CL except rabbit.
- MHC class I alpha chains: All, in alpha-3 domains, including in the
cytomegalovirus MHC-1 homologous protein [6].
- Beta-2-microglobulin: All.
- MHC class II alpha chains: All, in alpha-2 domains.
- MHC class II beta chains: All, in beta-2 domains.

[1] 

Gough N. 

Trends Biochem. Sci. 6:203-205(1981).
[2] 

Klein J., Figueroa F. 

Immunol. Today 7:41-44(1986). 

[3] 

Figueroa F., Klein J. 

Immunol. Today 7:78-81(1986). 

[4] 

Orr H.T., Lancet D., Robb R.J., Lopez de Castro J.A., Strominger J.L. 

Nature 282:266-270(1979). 

[5] 

Cushley W., Owen M.J. 
Immunol. Today 4:88-92(1983). 
[6] 

Beck S., Barrell B.G.
Nature 331:269-272(1988).

818. (IGFBP) Insulin-like growth factor binding proteins signature 
PROSITE cross-reference(s): PS00222; IGF_BINDING 

The insulin-like growth factors (IGF-I and IGF-II) bind to specific binding
proteins in extracellular fluids with high affinity [1,2,3]. These IGF-binding
proteins (IGFBP) prolong the half-life of the IGFs and have been shown to
either inhibit or stimulate the growth promoting effects of the IGFs on cells
in culture. They seem to alter the interaction of IGFs with their cell surface
receptors. There are at least six different IGFBPs and they are structurally
related.

The following growth-factor inducible proteins are structurally related to 
IGFBPs and could function as growth-factor binding proteins [4,5]: 

- Mouse protein cyr61 and its probable chicken homolog, protein CEF-10. 

- Human connective tissue growth factor (CTGF) and its mouse homolog, protein 
FISP-12. 

- Vertebrate protein NOV. 

As a signature pattern a conserved cysteine-rich region located in the N-terminal
section of these proteins is used.

Consensus pattern: G-C-[GS]-C-C-x(2)-C-A-x(6)-C

Sequences known to belong to this class detected by the pattern: ALL, except
for IGFBP-6's.

[1] 

Rechler M.M. 

Vitam. Horm. 47:1-114(1993). 
[2] 

Shimasaki S., Ling N. 

Prog. Growth Factor Res. 3:243-266(1991). 
[3] 

Clemmons D.R. 

Trends Endocrinol. Metab. 1:412-417(1990). 
[4] 

Bradham D.M., Igarashi A., Potter R.L., Grotendorst G.R.
J. Cell Biol. 114:1285-1294(1991).

[5]

Maloisel V., Martinerie C., Dambrine G., Plassiart G., Brisac M., Crochet
J., Perbal B.
Mol. Cell. Biol. 12:10-21(1992).

819. LMWPc : Low molecular weight phosphotyrosine protein phosphatase 
Number of members: 34 

[1] Medline: 94329182, The crystal structure of a low-molecular-weight phosphotyrosine
protein phosphatase. Su XD, Taddei N, Stefani M, Ramponi G, Nordlund P; Nature
1994;370:575-578.

820. (myosin_head) ATP/GTP-binding site motif A (P-loop) 
PROSITE cross-reference(s): PS00017; ATP_GTP_A 

From sequence comparisons and crystallographic data analysis it has been shown 
[1,2,3,4,5,6] that an appreciable proportion of proteins that bind ATP or GTP 
share a number of more or less conserved sequence motifs. The best conserved 
of these motifs is a glycine-rich region, which typically forms a flexible 
loop between a beta-strand and an alpha-helix. This loop interacts with one of 
the phosphate groups of the nucleotide. This sequence motif is generally 
referred to as the 'A' consensus sequence [1] or the 'P-loop' [5]. 

There are numerous ATP- or GTP-binding proteins in which the P-loop is found. 
A number of protein families for which the relevance of the 
presence of such motif has been noted is listed below: 

- ATP synthase alpha and beta subunits (see <PDOC00137>). 

- Myosin heavy chains. 

- Kinesin heavy chains and kinesin-like proteins (see <PDOC00343>). 

- Dynamins and dynamin-like proteins (see <PDOC00362>). 

- Guanylate kinase (see <PDOC00670>).

- Thymidine kinase (see <PDOC00524>).

- Thymidylate kinase (see <PDOC01034>).

- Shikimate kinase (see <PDOC00868>).

- Nitrogenase iron protein family (nifH/frxC) (see <PDOC00580>). 

- ATP-binding proteins involved in 'active transport' (ABC transporters) [7]
(see <PDOC00185>).

- DNA and RNA helicases [8,9,10]. 

- GTP-binding elongation factors (EF-Tu, EF-1 alpha, EF-G, EF-2, etc.). 

- Ras family of GTP-binding proteins (Ras, Rho, Rab, Ral, Ypt1, SEC4, etc.).

- Nuclear protein ran (see <PDOC00859>).

- ADP-ribosylation factors family (see <PDOC00781>). 

- Bacterial dnaA protein (see <PDOC00771>). 

- Bacterial recA protein (see <PDOC00131>). 

- Bacterial recF protein (see <PDOC00539>). 

- Guanine nucleotide-binding proteins alpha subunits (Gi, Gs, Gt, GO, etc.).

- DNA mismatch repair proteins mutS family (See <PDOC00388>). 

- Bacterial type II secretion system protein E (see <PDOC00567>). 

Not all ATP- or GTP-binding proteins are picked up by this motif. A number of
proteins escape detection because the structure of their ATP-binding site is
completely different from that of the P-loop. Examples of such proteins are
the E1-E2 ATPases or the glycolytic kinases. In other ATP- or GTP-binding
proteins the flexible loop exists in a slightly different form; this is the
case for tubulins or protein kinases. A special mention must be reserved for
adenylate kinase, in which there is a single deviation from the P-loop
pattern: in the last position Gly is found instead of Ser or Thr.

Consensus pattern: [AG]-x(4)-G-K-[ST]
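
The short P-loop pattern lends itself to a simple scan. The sketch below is an illustration under stated assumptions: the regular expression is a hand translation of the consensus pattern above, the name p_loop_hits is hypothetical, and a lookahead is used so that overlapping occurrences are all reported.

```python
import re

# Hand translation of [AG]-x(4)-G-K-[ST]; wrapping it in a lookahead lets
# finditer report hits that overlap one another.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")

def p_loop_hits(sequence: str):
    """Return (1-based start, matched octapeptide) for every P-loop hit."""
    return [(m.start() + 1, m.group(1)) for m in P_LOOP.finditer(sequence)]
```

With an illustrative test fragment such as MGAGGVGKSALT, the scan reports one hit beginning at position 2. Replacing the final Ser with Gly, the single deviation described above for adenylate kinase, yields no hit, which shows why such proteins escape detection by this motif.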

[1]

Walker J.E., Saraste M., Runswick M.J., Gay N.J. 

EMBO J. 1:945-951(1982). 

[2]

Moller W., Amons R.
FEBS Lett. 186:1-7(1985).
[3] 

Fry D.C., Kuby S.A., Mildvan A.S. 

Proc. Natl. Acad. Sci. U.S.A. 83:907-911(1986). 

[4] 

Dever T.E., Glynias M.J., Merrick W.C. 

Proc. Natl. Acad. Sci. U.S.A. 84:1814-1818(1987). 

[5] 

Saraste M., Sibbald P.R., Wittinghofer A. 
Trends Biochem. Sci. 15:430-434(1990). 
[6] 

Koonin E.V. 

J. Mol. Biol. 229:1165-1174(1993). 
[7]

Higgins C.F., Hyde S.C., Mimmack M.M., Gileadi U., Gill D.R., Gallagher 
M.P. 

J. Bioenerg. Biomembr. 22:571-592(1990). 
[8] 

Hodgman T.C. 

Nature 333:22-23(1988) and Nature 333:578-578(1988) (Errata). 
[9] 

Linder P., Lasko P., Ashburner M., Leroy P., Nielsen P.J., Nishi K., 
Schnier J., Slonimski P.P. 
Nature 337:121-122(1989). 
[10] 

Gorbalenya A.E., Koonin E.V., Donchenko A.P., Blinov V.M. 
Nucleic Acids Res. 17:4713-4730(1989). 

821. PE: PE family 

This family is named after a PE motif near the amino terminus of the domain. The PE family
of proteins all contain an amino-terminal region of about 110 amino acids. The carboxyl
termini of this family are variable and fall into several classes. The largest class of PE
proteins is the highly repetitive PGRS class, which has a high glycine content. The function
of these proteins is uncertain but it has been suggested that they may be related to antigenic
variation of Mycobacterium tuberculosis [1].

Number of members: 88

[1] Medline: 98295987. Deciphering the biology of Mycobacterium tuberculosis from the 
complete genome sequence. Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, 
Gordon SV, Eiglmeier K, Gas S, Barry CE 3rd, Tekaia F, Badcock K, Basham D, Brown D, 
Chillingworth T, Connor R, Davies R, Devlin K, Feltwell T, Gentles S, Hamlin N, Holroyd 
S, Hornsby T, Jagels K, Barrell BG, et al; Nature 1998;393:537-544. 

822. (RNB) Ribonuclease II family signature 
PROSITE cross-reference(s): PS01175; RIBONUCLEASE_II

On the basis of sequence similarities, the following bacterial and eukaryotic 
proteins seem to form a family: 

- Escherichia coli and related bacteria ribonuclease II (EC 3.1.13.1) (RNase 
II) (gene rnb) [1]. RNase II is an exonuclease involved in mRNA decay. It 
degrades mRNA by hydrolyzing single-stranded polyribonucleotides 
processively in the 3' to 5' direction. 

- Bacterial protein vacB. In Shigella flexneri, vacB has been shown to be 
required for the expression of virulence genes at the posttranscriptional 
level. 

- Yeast protein SSD1 (or SRK1) which is implicated in the control of the cell
cycle G1 phase.

- Yeast protein DIS3 [2], which binds to ran (GSP1) and enhances the
nucleotide-releasing activity of RCC1 on ran.

- Fission yeast protein dis3, which is implicated in mitotic control. 

- Neurospora crassa cyt-4, a mitochondrial protein required for RNA 5' and 3' 
end processing and splicing. 

- Yeast protein MSU1, which is involved in mitochondrial biogenesis.

- Synechocystis strain PCC 6803 protein zam [3], which controls resistance to
the carbonic anhydrase inhibitor acetazolamide.

- Caenorhabditis elegans hypothetical protein F48E8.6.

The size of these proteins ranges from 644 residues (rnb) to 1250 (SSD1). While
their sequence is highly divergent they share a conserved domain in their
C-terminal section [4]. It is possible that this domain plays a role in a
putative exonuclease function that would be common to all these proteins. A
signature pattern was developed based on the core of this conserved domain.

Consensus pattern: [HI]-[FYE]-[GSTAM]-[LIVM]-x(4,5)-Y-[STAL]-x-[FWVAC]-[TV]-
[SA]-P-[LIVMA]-[RQ]-[KR]-[FY]-x-D-x(3)-[HQ]

[1] 

Zilhao R., Camelo L., Arraiano C.M.
Mol. Microbiol. 8:43-51(1993). 
[2] 

Noguchi E., Hayashi N., Azuma Y., Seki T., Nakamura M., Nakashima N., 
Yanagida M., He X., Mueller U., Sazer S., Nishimoto T. 
EMBO J. 15:5595-5605(1996). 
[3] 

Beuf L., Bedu S., Cami B., Joset F. 
Plant Mol. Biol. 27:779-788(1995). 
[4] 

Mian I.S. 

Nucleic Acids Res. 25:3187-3195(1997). 

823. Src homology 2 (SH2) domain profile 
PROSITE cross-reference(s): PS50001; SH2 

The Src homology 2 (SH2) domain is a protein domain of about 100 amino-acid 
residues first identified as a conserved sequence region between the 
oncoproteins Src and Fps [1]. Similar sequences were later found in many other 
intracellular signal-transducing proteins [2]. SH2 domains function as
regulatory modules of intracellular signalling cascades by interacting with
high affinity to phosphotyrosine-containing target peptides in a
sequence-specific and strictly phosphorylation-dependent manner [3,4,5,6].

The SH2 domain has a conserved 3D structure consisting of two alpha helices 
and six to seven beta-strands. The core of the domain is formed by a 
continuous beta-meander composed of two connected beta-sheets [7]. 

So far, SH2 domains have been identified in the following proteins: 

- Many vertebrate, invertebrate and retroviral cytoplasmic (non-receptor)
protein tyrosine kinases, in particular those of the Src, Abl, Btk, Csk and
ZAP70 families of kinases.

- Mammalian phosphatidylinositol-specific phospholipase C gamma-1 and -2. Two
copies of the SH2 domain are found in those proteins in between the
catalytic 'X-' and 'Y-boxes' (see <PDOC50007>).

- Mammalian phosphatidyl inositol 3-kinase regulatory p85 subunit. 

- Some vertebrate and invertebrate protein-tyrosine phosphatases. 

- Mammalian Ras GTPase-activating protein (GAP). 

- Adaptor proteins mediating binding of guanine nucleotide exchange factors 
to growth factor receptors: vertebrate GRB2, Caenorhabditis elegans sem-5 
and Drosophila DRK. 

- Mammalian Vav oncoprotein, a guanine-nucleotide exchange factor of the 
CDC24 family. 

- Miscellaneous proteins interacting with vertebrate receptor protein
tyrosine kinases: oncoprotein Crk, mammalian cytoplasmic proteins Nck, Shc.

- STAT proteins (signal transducers and activators of transcription). 

- Chicken tensin. 

- Yeast transcriptional control protein SPT6. 

The profile developed to detect SH2 domains is based on a structural alignment
consisting of 8 gap-free blocks and 7 linker regions totaling 92 match
positions.

[1]

Sadowski I., Stone J.C., Pawson T. 
Mol. Cell. Biol. 6:4396-4408(1986). 
[2] 

Russell R.B., Breed J., Barton G.J.
FEBS Lett. 304:15-20(1992). 
[3] 

Marengere L.E.M., Pawson T.
J. Cell Sci. Suppl. 18:97-104(1994).

[4] 

Pawson T., Schlessinger J. 
Curr. Biol. 3:434-442(1993). 
[5] 

Mayer B.J., Baltimore D. 
Trends Cell. Biol. 3:8-13(1993). 
[6] 

Pawson T. 

Nature 373:573-580(1995). 
[7] 

Kuriyan J., Cowburn D. 

Curr. Opin. Struct. Biol. 3:828-837(1993). 

824. Sulfate transporters signature 

PROSITE cross-reference(s): PS01130; SULFATE_TRANSP 

A number of proteins involved in the transport of sulfate across a membrane,
as well as some as yet uncharacterized proteins, have been shown [1,2] to be
evolutionarily related. These proteins are:

- Neurospora crassa sulfate permease II (gene cys-14). 

- Yeast sulfate permeases (genes SUL1 and SUL2).

- Rat sulfate anion transporter 1 (SAT-1). 

- Mammalian DTDST, a probable sulfate transporter which, in human, is
involved in the genetic disease diastrophic dysplasia (DTD).

- Sulfate transporters 1, 2 and 3 from the legume Stylosanthes hamata. 

- Human pendrin (gene PDS), which is involved in a number of hearing loss 
genetic diseases. 

- Human protein DRA (Down-Regulated in Adenoma). 

- Soybean early nodulin 70. 

- Escherichia coli hypothetical protein ychM. 

- Caenorhabditis elegans hypothetical protein F41D9.5.

As expected by their transport function, these proteins are highly hydrophobic 
and seem to contain about 12 transmembrane domains. The best conserved region 
seems to be located in the second transmembrane region and is used as a 
signature pattern. 

Consensus pattern: [PAV]-x-Y-[GS]-L-Y-[STAG](2)-x(4)-[LIVFYA]-[LIVST]-[YI]-
x(3)-[GA]-[GST]-S-[KR]

[1] 

Sandal N.N., Marcker K.A. 

Trends Biochem. Sci. 19:19-19(1994). 

[2] 

Smith F.W., Hawkesford M.J., Prosser I.M., Clarkson D.T. 
Mol. Gen. Genet. 247:709-715(1995). 

825. TYA: TYA transposon protein 

Ty elements are yeast transposons. A 5.7 kb transcript codes for p3, a fusion protein of TYA
and TYB. The TYA protein is analogous to the gag protein of retroviruses. TYA is cleaved to
form a 46 kd protein which can form mature virion-like particles [1]. Number of members: 59

[1] Medline: 97404699. Cryo-electron microscopy structure of yeast Ty retrotransposon 
virus-like particles. Palmer KJ, Tichelaar W, Myers N, Burns NR, Butcher SJ, Kingsman AJ, 
Fuller SD, Saibil HR; J Virol 1997;71:6863-6868.

826. Aldolase_II

Class II Aldolase and Adducin N-terminal domain. 

-!- This family includes class II aldolases and adducins which have not been ascribed any
enzymatic function. Number of members: 37

References: 

[1] Medline: 93294819. The spatial structure of the class II L-fuculose-1-phosphate aldolase
from Escherichia coli. Dreyer MK, Schulz GE; J Mol Biol 1993;231:549-553.

[2] Medline: 96256522. Catalytic mechanism of the metal-dependent fuculose aldolase from
Escherichia coli as derived from the structure. Dreyer MK, Schulz GE; J Mol Biol
1996;259:458-466.

827. CBD_2 

-!- Two tryptophan residues are involved in cellulose binding. 

-!- Cellulose binding domain found in bacteria. Number of members: 51 

References: 

[1] Medline: 95284032. Solution structure of a cellulose-binding domain from Cellulomonas 
fimi by nuclear magnetic resonance spectroscopy. Xu GY, Ong E, Gilkes NR, Kilburn DG, 
Muhandiram DR, Harris-Brandts M, Carver JP, Kay LE, Harvey TS; Biochemistry 
1995;34:6993-7009. 

828. P 

A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an 
additional highly conserved sequence of approximately 150 residues (P domain) located 
immediately downstream of the catalytic domain. 
Number of members: 91 

References: 

[1] Medline: 94252314. A C-terminal domain conserved in precursor processing proteases is 
required for intramolecular N-terminal maturation of pro-Kex2 protease. Gluschankof P, 
Fuller RS; EMBO J 1994;13:2280-2288.

[2] Medline: 98225190. Regulatory roles of the P domain of the subtilisin-like prohormone
convertases. Zhou A, Martin S, Lipkind G, LaMendola J, Steiner DF; J Biol Chem 
1998;273:11107-11114. 

829. Uncharacterized protein family UPF0020 signature 
PROSITE cross-reference(s): PS01261; UPF0020 

The following uncharacterized proteins have been shown [1] to share regions of 
similarities: 

- Escherichia coli hypothetical protein ycbY and HI0116/15, the corresponding
Haemophilus influenzae protein.

- Bacillus subtilis hypothetical protein ypsC. 

- Synechocystis strain PCC 6803 hypothetical protein slr0064. 

- Methanococcus jannaschii hypothetical proteins MJ0438 and MJ0710. 

These are hydrophilic proteins of 40 Kd to about 80 Kd. They can be
picked up in the database by the following pattern.

Consensus pattern: D-P-[LIVMF]-C-G-[ST]-G-x(3)-[LI]-E

References: 

[ 1] Bairoch A. Unpublished observations (1997). 

830. Uncharacterized protein family UPF0031 signatures 

PROSITE cross-reference(s): PS01049; UPF0031_1, PS01050; UPF0031_2

The following uncharacterized proteins have been shown [1] to share regions of
similarities:

- Yeast chromosome XI hypothetical protein YKL151c. 

- Caenorhabditis elegans hypothetical protein R107.2. 

- Escherichia coli hypothetical protein yjeF. 

- Bacillus subtilis hypothetical protein yxkO. 

- Helicobacter pylori hypothetical protein HP1363.

- Mycobacterium tuberculosis hypothetical protein MtCY77.05c.

- Mycobacterium leprae hypothetical protein B229_C2_201. 

- Synechocystis strain PCC 6803 hypothetical protein sll1433.

- Methanococcus jannaschii hypothetical protein MJ1586. 

These are proteins of about 30 to 40 Kd whose central region is well 
conserved. They can be picked up in the database by the following patterns. 

Consensus pattern: [SAV]-[IVW]-[LVA]-[LIV]-G-[PNS]-G-L-[GP]-x-[DENQT]
Consensus pattern: [GA]-G-x-G-D-[TV]-[LT]-[STA]-G-x-[LIVM]

831. (ACOX) 
Acyl-CoA oxidase 

This is a family of acyl-CoA oxidases EC:1.3.3.6. Acyl-CoA oxidase converts acyl-CoA into
trans-2-enoyl-CoA [1].

Number of members: 39 

[1] Hayashi H, De Bellis L, Yamaguchi K, Kato A, Hayashi M, Nishimura M; Medline: 
98192624. "Molecular characterization of a glyoxysomal long chain acyl-CoA oxidase that is 
synthesized as a precursor of higher molecular mass in pumpkin." J Biol Chem 
1998;273:8301-8307. 

832. (AICARFT_IMPCHas) 
AICARFT/IMPCHase bienzyme 

This is a family of bifunctional enzymes catalysing the last two steps in de novo purine
biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The
second-to-last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide
formyltransferase EC:2.1.2.3 (AICARFT); this enzyme catalyses the formylation of AICAR
with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate [1]. The last step is
catalysed by IMP (inosine monophosphate) cyclohydrolase EC:3.5.4.10 (IMPCHase),
cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP [1].

Number of members: 22 

[1] Akira T, Komatsu M, Nango R, Tomooka A, Konaka K, Yamauchi M, Kitamura Y,
Nomura S, Tsukamoto I; Medline: 97473523 "Molecular cloning and expression of a rat
cDNA encoding 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase/IMP
cyclohydrolase" [published erratum appears in Gene 1998 Feb 27;208(2):337] Gene
1997;197:289-293.

[2] Rayl EA, Moroson BA, Beardsley GP; Medline: 96147205 "The human purH gene 
product, 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase/IMP 
cyclohydrolase. Cloning, sequencing, expression, purification, kinetic analysis, and domain 
mapping." J Biol Chem 1996;271:2225-2233. 

833. (AOX) 
Alternative oxidase 

The alternative oxidase is used as a second terminal oxidase in the mitochondria; electrons
are transferred directly from reduced ubiquinone (ubiquinol) to oxygen, forming water [2].
This is not coupled to ATP synthesis and is not inhibited by cyanide; this pathway is a
single-step process [1]. In rice, the transcript levels of the alternative oxidase are increased
by low temperature [1].

Number of members: 27 

[1] Ito Y, Saisho D, Nakazono M, Tsutsumi N, Hirai A; Medline: 98086211 "Transcript
levels of tandem-arranged alternative oxidase genes in rice are increased by low
temperature." Gene 1997;203:121-129.

[2] Li Q, Ritzel RG, McLean LL, McIntosh L, Ko T, Bertrand H, Nargang FE; Medline:
96366413 "Cloning and analysis of the alternative oxidase gene of Neurospora crassa." 
Genetics 1996;142:129-140. 

834. (APH) 

Protein kinases signatures and profile 

Cross-reference(s): PS00107; PROTEIN_KINASE_ATP, PS00108; 
PROTEIN_KINASE_ST, PS00109; PROTEIN_KINASE_TYR, PS50011; 
PROTEIN_KINASE_DOM 

Eukaryotic protein kinases [1 to 5] are enzymes that belong to a very extensive family of 
proteins which share a conserved catalytic core common to both serine/threonine and tyrosine 
protein kinases. There are a number of conserved regions in the catalytic domain of protein 
kinases. Two of these regions have been selected to build signature patterns. The first region, 
which is located in the N-terminal extremity of the catalytic domain, is a glycine-rich stretch 
of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP 
binding. The second region, which is located in the central part of the catalytic domain, 
contains a conserved aspartic acid residue which is important for the catalytic activity of the 
enzyme [6]; two signature patterns were derived for that region: one specific for serine/ 
threonine kinases and the other for tyrosine kinases. A profile was developed which is based 
on the alignment in [1] and covers the entire catalytic domain. 

Consensus pattern: [LIV]-G-{P}-G-{P}-[FYWMGSTNH]-[SGA]-{PW}-[LIVCAT]-{PD}-x- 
[GSTACLIVMFY]-x(5,18)-[LIVMFYWCSTAR]-[AIVP]-[LIVMFAGCKR]-K [K binds 
ATP] 

Sequences known to belong to this class detected by the pattern: the majority of known
protein kinases, but it fails to find a number of them, especially viral kinases, which are
quite divergent in this region and are completely missed by this pattern.

Consensus pattern: [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3) [D is
an active site residue]

Sequences known to belong to this class detected by the pattern: most serine/threonine
specific protein kinases, with 10 exceptions (half of them viral kinases), and also Epstein-Barr
virus BGLF4 and Drosophila ninaC, which have respectively Ser and Arg instead of the
conserved Lys and which are therefore detected by the tyrosine kinase specific pattern
described below.

Consensus pattern: [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-[RSTAC]-x(2)-N-[LIVMFYC](3)
[D is an active site residue]

Sequences known to belong to this class detected by the pattern: tyrosine specific protein
kinases, with the exception of human ERBB3 and mouse blk. This pattern will also detect most
bacterial aminoglycoside phosphotransferases [8,9] and herpesviruses ganciclovir kinases [10],
which are proteins structurally and evolutionarily related to protein kinases.

Sequences known to belong to this class detected by the profile: ALL, except for three viral
kinases. This profile also detects receptor guanylate cyclases (see <PDOC00430>) and
2-5A-dependent ribonucleases. Sequence similarities between these two families and the
eukaryotic protein kinase family have been noticed before. It also detects Arabidopsis
thaliana kinase-like protein TMKL1, which seems to have lost its catalytic activity.

Note: if a protein analyzed includes the two protein kinase signatures, the probability of it
being a protein kinase is close to 100%.
Note: eukaryotic-type protein kinases have also been found in prokaryotes such as
Myxococcus xanthus [11] and Yersinia pseudotuberculosis.
Note: the patterns shown above have been updated since their publication in [7].
Note: this documentation entry is linked to both signature patterns and a profile. As the
profile is much more sensitive than the patterns, you should use it if you have access to the
necessary software tools to do so.
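
The rule that a sequence carrying both protein kinase signatures is very likely a protein kinase can be sketched as a simple scan. The code below is an illustrative sketch only: the three regular expressions are hand translations of the consensus patterns quoted in this entry (with {P} read as "any residue except P"), and the function name kinase_signatures and the test peptide are hypothetical.

```python
import re

# Hand-translated regular expressions for the three patterns in this entry.
ATP_BINDING = re.compile(
    r"[LIV]G[^P]G[^P][FYWMGSTNH][SGA][^PW][LIVCAT][^PD].[GSTACLIVMFY]"
    r".{5,18}[LIVMFYWCSTAR][AIVP][LIVMFAGCKR]K"
)
SER_THR_ACTIVE_SITE = re.compile(r"[LIVMFYC].[HY].D[LIVMFY]K.{2}N[LIVMFYCT]{3}")
TYR_ACTIVE_SITE = re.compile(r"[LIVMFYC].[HY].D[LIVMFY][RSTAC].{2}N[LIVMFYC]{3}")

def kinase_signatures(sequence: str) -> dict:
    """Report which protein kinase signatures occur in a protein sequence."""
    seq = sequence.upper()
    hits = {
        "atp_binding": ATP_BINDING.search(seq) is not None,
        "ser_thr_active_site": SER_THR_ACTIVE_SITE.search(seq) is not None,
        "tyr_active_site": TYR_ACTIVE_SITE.search(seq) is not None,
    }
    # "The two signatures": the ATP-binding region plus either active-site pattern.
    hits["both_signatures"] = hits["atp_binding"] and (
        hits["ser_thr_active_site"] or hits["tyr_active_site"]
    )
    return hits

# Hypothetical test peptide assembled from motif-like fragments; not a real protein.
TEST_PEPTIDE = "MA" + "LGTGSFGRVMLVKHKETGNHYAMK" + "GGSGG" + "LIYRDLKPENLLI" + "DE"
print(kinase_signatures(TEST_PEPTIDE))
```

A scan of this kind covers only the two signature patterns; as the entry notes, the profile is more sensitive and needs dedicated profile-search software.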

References 

[ 1] Hanks S.K., Hunter T., FASEB J. 9:576-596(1995). 

[ 2] Hunter T., Meth. Enzymol. 200:3-37(1991). 

[ 3] Hanks S.K., Quinn A.M., Meth. Enzymol. 200:38-62(1991). 

[ 4] Hanks S.K., Curr. Opin. Struct. Biol. 1:369-383(1991).

[ 5] Hanks S.K., Quinn A.M., Hunter T., Science 241:42-52(1988).

[ 6] Knighton D.R., Zheng J., Ten Eyck L.F., Ashford V.A., Xuong N.-H., Taylor S.S.,
Sowadski J.M., Science 253:407-414(1991).

[ 7] Bairoch A., Claverie J.-M., Nature 331:22(1988). 

[ 8] Benner S., Nature 329:21-21(1987). 

[ 9] Kirby R., J. Mol. Evol. 30:489-492(1992). 

[10] Littler E., Stuart A.D., Chee M.S., Nature 358:160-162(1992). 

[11] Munoz-Dorado J., Inouye S., Inouye M., Cell 67:995-1006(1991). 

835. (Asp_Glu_race) 

Aspartate and glutamate racemases signatures 

Cross-reference(s) PS00923; ASP_GLU_RACEMASE_1 PS00924; 
ASP_GLU_RACEMASE_2 

Aspartate racemase (EC 5.1.1.13) and glutamate racemase (EC 5.1.1.3) are two evolutionarily
related bacterial enzymes that do not seem to require a cofactor for their activity [1].
Glutamate racemase, which interconverts L-glutamate and D-glutamate, is required for the
biosynthesis of peptidoglycan and some peptide-based antibiotics such as gramicidin S. In
addition to characterized aspartate and glutamate racemases, this family also includes a
hypothetical protein from Erwinia carotovora and one from Escherichia coli (ygeA). Two
conserved cysteines are present in the sequence of these enzymes. They are expected to play
a role in catalytic activity by acting as bases in proton abstraction from the substrate.
Signature patterns were developed for both cysteines.

Consensus pattern: [IVA]-[LIVM]-x-C-x(0,1)-N-[ST]-[MSA]-[STH]-[LIVFYSTANK]
Consensus pattern: [LIVM](2)-x-[AG]-C-T-[DEH]-[LIVMFY]-[PNGRS]-x-[LIVM] 
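
The two racemase patterns use two further elements of the notation: an optional position, read here as x(0,1), and a repeated residue class such as [LIVM](2). The sketch below shows one way to express both as regular expressions. It is illustrative only; the function name and the test peptides are hypothetical and were built to fit the patterns.

```python
import re

# x(0,1) becomes ".{0,1}" (zero or one residue of any kind);
# [LIVM](2) becomes "[LIVM]{2}" (two consecutive residues from the class).
RACEMASE_1 = re.compile(r"[IVA][LIVM].C.{0,1}N[ST][MSA][STH][LIVFYSTANK]")
RACEMASE_2 = re.compile(r"[LIVM]{2}.[AG]CT[DEH][LIVMFY][PNGRS].[LIVM]")

def racemase_signatures(sequence: str) -> tuple:
    """Return (first pattern found, second pattern found) for a sequence."""
    seq = sequence.upper()
    return (RACEMASE_1.search(seq) is not None,
            RACEMASE_2.search(seq) is not None)

# Hypothetical peptides: the first pattern without and with the optional residue.
print(racemase_signatures("ILACNTMSL"))
print(racemase_signatures("ILACGNTMSL"))
# Hypothetical peptide fitting the second pattern only.
print(racemase_signatures("LLAGCTDLPAL"))
```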



[ 1] Gallo K.A., Knowles J.R., Biochemistry 32:3981-3990(1993).

836. (ATP-sulfurylase)
ATP-sulfurylase 

This family consists of ATP-sulfurylase or sulfate adenylyltransferase EC:2.7.7.4, some of
which are part of a bifunctional polypeptide chain associated with adenosyl phosphosulphate
(APS) kinase (APS_kinase). Both enzymes are required for PAPS (phosphoadenosine-
phosphosulfate) synthesis from inorganic sulphate [2]. ATP sulfurylase catalyses the
synthesis of adenosine-phosphosulfate (APS) from ATP and inorganic sulphate [1].

Number of members: 37 

[1] Kurima K, Warman ML, Krishnan S, Domowicz M, Krueger RC Jr, Deyrup A, Schwartz
NB; Medline: 98337975 "A member of a family of sulfate-activating enzymes causes murine
brachymorphism" [published erratum appears in Proc Natl Acad Sci U S A 1998 Sep
29;95(20):12071] Proc Natl Acad Sci U S A 1998;95:8681-8685.

[2] Rosenthal E, Leustek T; Medline: 96096529 "A multifunctional Urechis caupo protein,
PAPS synthetase, has both ATP sulfurylase and APS kinase activities." Gene
1995;165:243-248.



837. (ATP-synt_F) 

ATP synthase (F/14-kDa) subunit 

This family includes the 14-kDa subunit from vATPases [1], which is in the peripheral catalytic
part of the complex [2]. The family also includes archaebacterial ATP synthase subunit F [3].

Number of members: 23 

[1] Guo Y, Kaiser K, Wieczorek H, Dow JA; Medline: 96269411 "The Drosophila
melanogaster gene vha14 encoding a 14-kDa F-subunit of the vacuolar ATPase." Gene
1996;172:239-243.

[2] Peng SB, Crider BP, Tsai SJ, Xie XS, Stone DK; Medline: 96216416 "Identification of a
14-kDa subunit associated with the catalytic sector of clathrin-coated vesicle H+-ATPase." J 
Biol Chem 1996;271:3324-3327. 

[3] Wilms R, Freiberg C, Wegerle E, Meier I, Mayer F, Muller V; Medline: 96324968
"Subunit structure and organization of the genes of the A1A0 ATPase from the Archaeon
Methanosarcina mazei Gö1." J Biol Chem 1996;271:18843-18852.

838. (CBD_4) 
Starch binding domain 

Number of members: 48 

839. (CbiX) 

The function of CbiX is uncertain; however, it is found in cobalamin biosynthesis operons and
so may have a related function. Some CbiX proteins contain a striking histidine-rich region at
their C-terminus, which suggests that it might be involved in metal chelation [1].

Number of members: 6 

[1] Raux E, Lanois A, Warren MJ, Rambach A, Thermes C; Medline: 98416126 "Cobalamin
(vitamin B12) biosynthesis: identification and characterization of a Bacillus megaterium cobI
operon." Biochem J 1998;335:159-166.

840. (Complex1_51K)

Respiratory-chain NADH dehydrogenase 51 Kd subunit signatures
Cross-reference(s) PS00644; COMPLEX1_51K_1 PS00645; COMPLEX1_51K_2

Respiratory-chain NADH dehydrogenase (EC 1.6.5.3) [1,2] (also known as complex I or
NADH-ubiquinone oxidoreductase) is an oligomeric enzymatic complex located in the inner 
mitochondrial membrane which also seems to exist in the chloroplast and in cyanobacteria 
(as a NADH-plastoquinone oxidoreductase). Among the 25 to 30 polypeptide subunits of this 
bioenergetic enzyme complex there is one with a molecular weight of 51 Kd (in mammals), 
which is the second largest subunit of complex I and is a component of the iron-sulfur (IP) 
fragment of the enzyme. It seems to bind to NAD, FMN, and a 2Fe-2S cluster. 

The 51 Kd subunit is highly similar to [3,4]: 

- Subunit alpha of Alcaligenes eutrophus NAD-reducing hydrogenase (gene hoxF) which 
also binds to NAD, FMN, and a 2Fe-2S cluster. 

- Subunit NQO1 of Paracoccus denitrificans NADH-ubiquinone oxidoreductase.

- Subunit F of Escherichia coli NADH-ubiquinone oxidoreductase (gene nuoF). 

The 51 Kd subunit and the bacterial hydrogenase alpha subunit contain three regions of
sequence similarity. The first one most probably corresponds to the NAD-binding site, the
second to the FMN-binding site, and the third one, which contains three cysteines, to the
iron-sulfur binding region. Signature patterns have been developed for the FMN-binding and
for the 2Fe-2S binding regions.

Consensus pattern: G-[AM]-G-[AR]-Y-[LIVM]-C-G-[DE](2)-[STA](2)-[LIM](2)-[EN]-S
Consensus pattern: E-S-C-G-x-C-x-P-C-R-x-G [The three C's are putative 2Fe-2S ligands]
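
Because the second pattern names three specific residues (the putative 2Fe-2S ligand cysteines), a match can be reported as residue positions and not only as a yes/no hit. The sketch below does this with capture groups; it is illustrative only, and the function name and test peptide are hypothetical.

```python
import re

# E-S-C-G-x-C-x-P-C-R-x-G, with a capture group around each cysteine.
FE_S_SIGNATURE = re.compile(r"ES(C)G.(C).P(C)R.G")

def putative_ligand_positions(sequence: str):
    """Return the 1-based positions of the three cysteines, or None if no match."""
    match = FE_S_SIGNATURE.search(sequence.upper())
    if match is None:
        return None
    return [match.start(group) + 1 for group in (1, 2, 3)]

# Hypothetical peptide built to satisfy the pattern; not a real subunit sequence.
print(putative_ligand_positions("AAESCGKCTPCREGAA"))
```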

[ 1] Ragan C.I., Curr. Top. Bioenerg. 15:1-36(1987). 

[ 2] Weiss H., Friedrich T., Hofhaus G., Preis D., Eur. J. Biochem. 197:563-576(1991). 

[ 3] Fearnley I.M., Walker J.E., Biochim. Biophys. Acta 1140:105-134(1992).

[ 4] Weidner U., Geier S., Ptock A., Friedrich T., Leif H., Weiss H., J. Mol. Biol.
233:109-122(1993).

841. (DAP_epimerase) 
Diaminopimelate epimerase signature
Cross-reference(s) PS01326; DAP_EPIMERASE

Diaminopimelate epimerase (EC 5.1.1.7) catalyzes the isomerization of L,L- to D,L-meso-
diaminopimelate in the biosynthetic pathway leading from aspartate to lysine. This enzyme is
a protein of about 30 Kd. Two conserved cysteines seem [1] to function as the acid and base
in the catalytic mechanism. As a signature pattern, the region surrounding the first of these
two active site cysteines was selected.

Consensus pattern: N-x-D-G-S-x(4)-C-G-N-[GA]-x-R [C is an active site residue]

Sequences known to belong to this class detected by the pattern: ALL, except for an
Anabaena dapF which has a Ser instead of the active site Cys.
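
The exception noted above follows directly from how the pattern is written: the active site position accepts only C, so a sequence with Ser at that position cannot match. The sketch below illustrates this with a hypothetical peptide and its Cys-to-Ser variant; neither is the Anabaena sequence, and the names are assumptions.

```python
import re

# N-x-D-G-S-x(4)-C-G-N-[GA]-x-R; the C is the active site residue.
DAP_EPIMERASE = re.compile(r"N.DGS.{4}CGN[GA].R")

def has_signature(sequence: str) -> bool:
    """True if the diaminopimelate epimerase pattern occurs in the sequence."""
    return DAP_EPIMERASE.search(sequence.upper()) is not None

# Hypothetical peptide built to fit the pattern; the cysteine is at index 9.
wild_type = "NADGSAAAACGNGAR"
# Same peptide with Ser in place of the active site Cys.
cys_to_ser = wild_type[:9] + "S" + wild_type[10:]

print(has_signature(wild_type), has_signature(cys_to_ser))
```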

[ 1] Cirilli M., Zheng R., Scapin G., Blanchard J.S., Biochemistry 37:16452-16458(1998).

842. (DNA_gyraseB_C) 

DNA topoisomerase II signature 

Cross-reference(s) PS00177; TOPOISOMERASE_II

DNA topoisomerase II (EC 5.99.1.3) [1,2,3,4,E1] is one of the two types of enzyme that
catalyze the interconversion of topological DNA isomers. Type II topoisomerases are ATP-
dependent and act by passing a DNA segment through a transient double-strand break.
Topoisomerase II is found in phages, archaebacteria, prokaryotes, eukaryotes, and in
African Swine Fever virus (ASF). In bacteriophage T4, topoisomerase II consists of three
subunits (the products of genes 39, 52 and 60). In prokaryotes and in archaebacteria the
enzyme, known as DNA gyrase, consists of two subunits (genes gyrA and gyrB [E2]). In
some bacteria, a second type II topoisomerase has been identified; it is known as
topoisomerase IV and is required for chromosome segregation. It also consists of two
subunits (genes parC and parE). In eukaryotes, type II topoisomerase is a homodimer.

There are many regions of sequence homology between the different subtypes of
topoisomerase II.

[Schematic not reproduced: relation between the different subunits, with the position of the
pattern marked.]

As a signature pattern for this family of proteins, a region that contains a highly conserved 
pentapeptide was selected. The pattern is located in gyrB, in parE, and in protein 39 of phage 
T4 topoisomerase. 

Consensus pattern: [LIVMA]-x-E-G-[DN]-S-A-x-[STAG] 

[ 1] Sternglanz R., Curr. Opin. Cell Biol. 1:533-535(1990). 

[ 2] Bjornsti M.-A., Curr. Opin. Struct. Biol. 1:99-103(1991). 

[ 3] Sharma A., Mondragon A., Curr. Opin. Struct. Biol. 5:39-47(1995). 

[ 4] Roca J., Trends Biochem. Sci. 20:156-160(1995). 



843. (DUF16)
Protein of unknown function

The function of this protein is unknown. It appears to only occur in Mycoplasma
pneumoniae.

Number of members: 26


[1] Himmelreich R, Hilbert H, Plagens H, Pirkl E, Li BC, Herrmann R; Medline: 97105885 
"Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae." 
Nucleic Acids Res 1996;24:4420-4449.

844. (DUF21)

Domain of unknown function 

This transmembrane region has no known function. Many of the sequences in this family are
annotated as hemolysins; however, this is due to a similarity to Swiss:Q54318, which does not
contain this domain. This domain is found in the N-terminus of the proteins, adjacent to two
intracellular CBS domains (CBS).

Number of members: 42 

845. (DUF56) 

Integral membrane protein 

The members of this family are putative integral membrane proteins. The function of the
family is unknown; however, the family includes Sec59 from yeast. Sec59 is a dolichol
kinase EC:2.7.1.108, but it is not clear if the enzymatic activity resides in this region or its
N-terminal region.

Number of members: 13 

846. (DUF94) 

Domain of unknown function 

The function of this domain is unknown. It is found in both eukaryotes and archaebacteria.
The alignment contains a completely conserved aspartate residue that may be functionally
important. The eukaryotic domains contain three conserved cysteines and a histidine that
might be metal binding; however, these are absent in the archaebacterial proteins.

Number of members: 9 

847. (FF) 
FF domain 

This domain may be involved in protein-protein interaction [1]. 
Number of members: 42 

[1] Bedford MT, Leder P; Medline: 99322199 "The FF domain: a novel motif that often 
accompanies WW domains." Trends Biochem Sci 1999;24:264-265.

848. (FLO_LFY) 
Floricaula / Leafy protein 

This family consists of various plant development proteins which are homologues of 
floricaula (FLO) and Leafy (LFY) proteins which are floral meristem identity proteins. 
Mutations in the sequences of these proteins affect flower and leaf development. 

Number of members: 16 

[1] Hofer J, Turner L, Hellens R, Ambrose M, Matthews P, Michael A, Ellis N; Medline: 
97411151 "UNIFOLIATA regulates leaf and flower morphogenesis in pea." Curr Biol 
1997;7:581-587. 

[2] Weigel D, Alvarez J, Smyth DR, Yanofsky MF, Meyerowitz EM; Medline: 92274452
"LEAFY controls floral meristem identity in Arabidopsis." Cell 1992;69:843-859.

849. (G-patch)
G-patch domain 

This domain is found in a number of RNA binding proteins, and is also found in proteins that 
contain RNA binding domains. This suggests that this domain may have an RNA binding 
function. This domain has seven highly conserved glycines. 

Number of members: 47 

[1] Aravind L, Koonin EV; Medline: 10470032 "G-patch: a new conserved domain in 
eukaryotic RNA-processing proteins and type D retroviral polyproteins." Trends Biochem 
Sci 1999;24:342-344. 



850. (Gram-ve_porins) 

General diffusion Gram-negative porins signature 
Cross-reference(s) PS00576; GRAM_NEG_PORIN 

The outer membrane of Gram-negative bacteria acts as a molecular filter for hydrophilic
compounds. Proteins known as porins [1] are responsible for the 'molecular sieve' properties
of the outer membrane. Porins form large water-filled channels which allow the diffusion of
hydrophilic molecules into the periplasmic space. Some porins form general diffusion
channels that allow any solute up to a certain size (that size is known as the exclusion limit)
to cross the membrane, while other porins are specific for a solute and contain a binding site
for that solute inside the pores (these are known as selective porins). As porins are the major
outer membrane proteins, they also serve as receptor sites for the binding of phages and
bacteriocins. General diffusion porins generally assemble as trimers in the membrane, and the
transmembrane core of these proteins is composed exclusively of beta strands [2]. It has been
shown [3] that a number of general porins are evolutionarily related. These porins are:

- Enterobacteria phoE. 

- Enterobacteria ompC. 

- Enterobacteria ompF.

- Enterobacteria nmpC.

- Bacteriophage PA-2 LC. 

- Neisseria PI.A. 

- Neisseria PI.B.

As a signature pattern a conserved region was selected, located in the C-terminal part of these 
proteins, which spans two putative transmembrane beta strands. 

Consensus pattern: [LIVMFY]-x(2)-G-x(2)-Y-x-F-x-K-x(2)-[SN]-[STAV]-[LIVMFYW]-V

[1] Benz R., Bauer K., Eur. J. Biochem. 176:1-19(1988). 

[2] Jap B.K., Walian P.J., Q. Rev. Biophys. 23:367-403(1990). 

[3] Jeanteur D., Lakey J.H., Pattus F., Mol. Microbiol. 5:2153-2164(1991). 

851. (HlyD) 

HlyD family secretion proteins signature 
Cross-reference(s) PS00543; HLYD_FAMILY 

Gram-negative bacteria produce a number of proteins which are secreted into the growth
medium by a mechanism that does not require a cleaved N-terminal signal sequence. These
proteins, while having different functions, require the help of two or more proteins for their
secretion across the cell envelope. Among these helpers are a protein belonging to the ABC
transporters family (see the relevant entry <PDOC00185>) and a protein belonging to a
family which is currently composed [1 to 5] of the following members:

Gene  Species                    Protein which is exported

hlyD  Escherichia coli           Hemolysin
appD  A.pleuropneumoniae         Hemolysin
lcnD  Lactococcus lactis         Lactococcin A
lktD  A.actinomycetemcomitans    Leukotoxin
      Pasteurella haemolytica
rtxD  A.pleuropneumoniae         Toxin-III
cyaD  Bordetella pertussis       Calmodulin-sensitive adenylate cyclase-
                                 hemolysin (cyclolysin)
cvaA  Escherichia coli           Colicin V
prtE  Erwinia chrysanthemi       Extracellular proteases B and C
aprE  Pseudomonas aeruginosa     Alkaline protease
emrA  Escherichia coli           Drugs and toxins
yjcR  Escherichia coli           Unknown

These proteins are evolutionarily related and consist of 390 to 480 amino acid residues.
They seem to be anchored in the inner membrane by an N-terminal transmembrane region.
Their exact role in the secretion process is not yet known. The C-terminal section of these
proteins is the best conserved region; a signature pattern was derived from that region.

Consensus pattern: [LIVM]-x(2)-G-[LM]-x(3)-[STGAV]-x-[LIVMT]-x-[LIVMT]-[GE]-x- 
[KR]-x-[LIVMFYW](2)-x-[LIVMFYW](3) 

Sequences known to belong to this class detected by the pattern: ALL, except for emrA and
yjcR.

References: 

[1] Gilson L., Mahanty H.K., Kolter R., EMBO J. 9:3875-3884(1990). 

[2] Letoffe S., Delepelaire P., Wandersman C., EMBO J. 9:1375-1382(1990). 

[3] Stoddard G.W., Petzel J.P., van Belkum M.J., Kok J., McKay L.L., Appl. Environ.
Microbiol. 58:1952-1961(1992).

[4] Duong F., Lazdunski A., Cami B., Murgier M., Gene 121:47-54(1992). 
[5] Lewis K., Trends Biochem. Sci. 19:119-123(1994). 

852. (IBR) 

In Between Ring fingers 

The IBR (In Between Ring fingers) domain is found to occur between pairs of ring fingers 
(zf-C3HC4). The function of this domain is unknown. This domain has also been called the 
C6HC domain and DRIL (for double RING finger linked) domain [2]. 
Number of members: 25 



[1] Morett E, Bork P; Medline: 10366851 "A novel transactivation domain in parkin." Trends
Biochem Sci 1999;24:229-231.

[2] van der Reijden BA, Erpelinck-Verschueren CA, Lowenberg B, Jansen JH; Medline: 
99349709 "TRIADs: a new class of proteins with a novel cysteine-rich signature." Protein 
Sci 1999;8:1557-1561. 

853. (IPPT) 
IPP transferase

[1] Durand JM, Bjork GR, Kuwae A, Yoshikawa M, Sasakawa C; Medline: 97440126 "The
modified nucleoside 2-methylthio-N6-isopentenyladenosine in tRNA of Shigella flexneri is
required for expression of virulence genes." J Bacteriol 1997;179:5777-5782.

[2] Boguta M, Hunter LA, Shen WC, Gillman EC, Martin NC, Hopper AK; Medline:
94187700 "Subcellular locations of MOD5 proteins: mapping of sequences sufficient for
targeting to mitochondria and demonstration that mitochondrial and nuclear isoforms
commingle in the cytosol." Mol Cell Biol 1994;14:2298-2306.

[3] Gillman EC, Slusher LB, Martin NC, Hopper AK; Medline: 91203856 "MOD5
translation initiation sites determine N6-isopentenyladenosine modification of mitochondrial
and cytoplasmic tRNA." Mol Cell Biol 1991;11:2382-2390.

854. (KE2) 

KE2 family protein 

The function of members of this family is unknown, although they have been suggested to 
contain a DNA binding leucine zipper motif [2]. 

Number of members: 9 

[1] Ha H, Abe K, Artzt K; Medline: 92084131 "Primary structure of the embryo-expressed
gene KE2 from the mouse H-2K region." Gene 1991;107:345-346.

[2] Shang HS, Wong SM, Tan HM, Wu M; Medline: 95129859 "YKE2, a yeast nuclear gene
encoding a protein showing homology to mouse KE2 and containing a putative
leucine-zipper motif." Gene 1994;151:197-201.


855. (Lipoprotein_6) 

Prokaryotic membrane lipoprotein lipid attachment site 

Cross-reference(s) PS00013; PROKAR_LIPOPROTEIN 

In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide,
which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The 
peptidase recognizes a conserved sequence and cuts upstream of a cysteine residue to which 
a glyceride-fatty acid lipid is attached [1]. Some of the proteins known to undergo such 
processing currently include (for recent listings see [1,2,3]): 

- Major outer membrane lipoprotein (murein-lipoproteins) (gene lpp).

- Escherichia coli lipoprotein-28 (gene nlpA). 

- Escherichia coli lipoprotein-34 (gene nlpB). 

- Escherichia coli lipoprotein nlpC. 

- Escherichia coli lipoprotein nlpD. 

- Escherichia coli osmotically inducible lipoprotein B (gene osmB).

- Escherichia coli osmotically inducible lipoprotein E (gene osmE). 

- Escherichia coli peptidoglycan-associated lipoprotein (gene pal). 

- Escherichia coli rare lipoproteins A and B (genes rlpA and rlpB).

- Escherichia coli copper homeostasis protein cutF (or nlpE). 
- Escherichia coli plasmids traT proteins.

- Escherichia coli Col plasmids lysis proteins. 

- A number of Bacillus beta-lactamases. 

- Bacillus subtilis periplasmic oligopeptide -binding protein (gene oppA). 

- Borrelia burgdorferi outer surface proteins A and B (genes ospA and ospB). 
- Borrelia hermsii variable major protein 21 (gene vmp21) and 7 (gene vmp7).

- Chlamydia trachomatis outer membrane protein 3 (gene omp3). 

- Fibrobacter succinogenes endoglucanase cel-3. 

- Haemophilus influenzae proteins Pal and Pep.

- Klebsiella pullulanase (gene pulA).

- Klebsiella pullulanase secretion protein pulS.

- Mycoplasma hyorhinis protein p37.

- Mycoplasma hyorhinis variant surface antigens A, B, and C (genes vlpABC).

- Neisseria outer membrane protein H.8.

- Pseudomonas aeruginosa lipopeptide (gene lppL).

- Pseudomonas solanacearum endoglucanase egl. 

- Rhodopseudomonas viridis reaction center cytochrome subunit (gene cytC). 

- Rickettsia 17 Kd antigen. 

- Shigella flexneri invasion plasmid proteins mxiJ and mxiM.

- Streptococcus pneumoniae oligopeptide transport protein A (gene amiA). 

- Treponema pallidum 34 Kd antigen.

- Treponema pallidum membrane protein A (gene tmpA).

- Vibrio harveyi chitobiase (gene chb). 

- Yersinia virulence plasmid protein yscJ.

- Halocyanin from Natronobacterium pharaonis [4], a membrane-associated copper-binding
protein. (This is the first archaebacterial protein known to be modified in such a fashion.)

From the precursor sequences of all these proteins, a consensus pattern and a set of rules
to identify this type of post-translational modification were derived.

Consensus pattern: {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C [C is
the lipid attachment site]

Additional rules: 1) The cysteine must be between positions 15 and 35 of the sequence in
consideration. 2) There must be at least one Lys or one Arg in the first seven positions of
the sequence.

Sequences known to belong to this class detected by the pattern: ALL. Other sequence(s)
detected in SWISS-PROT: some 100 prokaryotic proteins. Some of them are not membrane
lipoproteins, but at least half of them could be.
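
The consensus pattern and the two additional rules can be combined into a single check. The following is a minimal sketch in Python, not a definitive implementation: the function name and the test sequence are illustrative assumptions, the third residue class is read as [LIVMFYSTAGCQ], and positions are counted from 1 at the start of the precursor sequence.

```python
import re

# Regex transcription of the consensus pattern. {DERK}(6) becomes [^DERK]{6};
# the third class is taken as [LIVMFYSTAGCQ] (assumption: Q, a standard residue code).
# A lookahead is used so that overlapping candidate sites are all reported.
LIPOBOX = re.compile(r"(?=([^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C))")


def lipid_attachment_sites(seq):
    """Return 1-based positions of cysteines that satisfy the pattern and both rules."""
    seq = seq.upper()
    # Rule 2: at least one Lys or Arg in the first seven positions.
    if not re.search(r"[KR]", seq[:7]):
        return []
    sites = []
    for m in LIPOBOX.finditer(seq):
        # The cysteine is the last residue of the 11-residue match.
        cys_pos = m.start(1) + len(m.group(1))
        # Rule 1: the cysteine must lie between positions 15 and 35.
        if 15 <= cys_pos <= 35:
            sites.append(cys_pos)
    return sites


if __name__ == "__main__":
    # Illustrative precursor-like sequence (hypothetical test input).
    print(lipid_attachment_sites("MKATKLVLGAVILGSTLLAGCSSNAKIDQ"))
```

A sequence that matches the pattern but lacks Lys or Arg in its first seven positions is rejected by this sketch, which mirrors rule 2 above.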


References 

[1] Hayashi S., Wu H.C., J. Bioenerg. Biomembr. 22:451-471(1990).
[2] Klein P., Somorjai R.L., Lau P.C.K., Protein Eng. 2:15-20(1988).
[3] von Heijne G., Protein Eng. 2:531-534(1989).

[4] Mattar S., Scharf B., Kent S.B.H., Rodewald K., Oesterhelt D., Engelhard M. J. Biol. 
Chem. 269:14939-14945(1994). 

856. (Lipoprotein_7) 
Adhesin lipoprotein 

This family consists of the p50 and variable adherence-associated antigen (Vaa) adhesins 
from Mycoplasma hominis. M. hominis is a mycoplasma associated with human urogenital 
diseases, pneumonia, and septic arthritis [1]. An adhesin is a cell surface molecule that 
mediates adhesion to other cells or to the surrounding surface or substrate. The Vaa antigen is 
a 50-kDa surface lipoprotein that has four tandem repetitive DNA sequences encoding a 
periodic peptide structure, and is highly immunogenic in the human host [1]. p50 is also a
50-kDa lipoprotein, having three repeats A, B and C, that may be a tetramer of 191 kDa in its
native environment [2].

Number of members: 18 

[1] Zhang Q, Wise KS; Medline: 96294788 "Molecular basis of size and antigenic variation 
of a Mycoplasma hominis adhesin encoded by divergent vaa genes. " Infect Immun 
1996;64:2737-2744. 

[2] Henrich B, Kitzerow A, Feldmann RC, Schaal H, Hadding U; Medline: 97047675 
"Repetitive elements of the Mycoplasma hominis adhesin p50 can be differentiated by 
monoclonal antibodies." Infect Immun 1996;64:4027-4034. 

857. (MaoC_like)
MaoC like domain 

The MaoC protein is found to share similarity with a wide variety of enzymes: estradiol 17
beta-dehydrogenase 4, peroxisomal hydratase-dehydrogenase-epimerase, and fatty acid synthase
beta subunit. All these enzymes contain other domains. This domain is also present in the
NodN nodulation protein N. No specific function has been assigned to this region of any of
these proteins. The maoC gene is part of an operon with maoA, which is involved in the
synthesis of monoamine oxidase [1].

Number of members: 46 

[1] Sugino H, Sasaki M, Azakami H, Yamashita M, Murooka Y; Medline: 96235221 "A
monoamine-regulated Klebsiella aerogenes operon containing the monoamine oxidase 
structural gene (maoA) and the maoC gene." J Bacteriol 1992;174:2485-2492. 

858. (MSP) 

Manganese-stabilizing protein / photosystem II polypeptide 

This family consists of the 33 kDa photosystem II polypeptide from the oxygen evolving
complex (OEC) of plants and cyanobacteria. The protein is also known as the
manganese-stabilizing protein, as it is associated with the manganese complex of the OEC and may
provide the ligands for the complex [1].

Number of members: 17 

[1] Philbrick JB, Zilinskas BA; Medline: 88334494 "Cloning, nucleotide sequence and 
mutational analysis of the gene encoding the Photosystem II manganese-stabilizing 
polypeptide of Synechocystis 6803." Mol Gen Genet 1988;212:418-425. 

859. (NAC) 

[1] Makarova KS, Aravind L, Galperin MY, Grishin NV, Tatusov RL, Wolf YI, Koonin EV;
Medline: 99342100 "Comparative genomics of the Archaea (Euryarchaeota): evolution of
conserved protein families, the stable core, and the variable shell." Genome Res 1999;9:608-628.

Number of members: 27

860. (Nop) 

Putative snoRNA binding domain

This family consists of various pre-RNA processing ribonucleoproteins. The function of the
aligned region is unknown; however, it may be a common RNA, snoRNA or Nop1p binding
domain. Nop5p (Nop58p) Swiss:Q12499 from yeast is the protein component of a
ribonucleoprotein required for pre-18S rRNA processing and is suggested to function
with Nop1p in a snoRNA complex [1]. Nop56p Swiss:O00567 and Nop5p interact with
Nop1p and are required for ribosome biogenesis [2]. Prp31p Swiss:P49704 is required for
pre-mRNA splicing in S. cerevisiae [3].

Number of members: 23

[1] Wu P, Brockenbrough JS, Metcalfe AC, Chen S, Aris JP; Medline: 98298165 "Nop5p is a
small nucleolar ribonucleoprotein component required for pre-18S rRNA processing in
yeast." J Biol Chem 1998;273:16453-16463.

[2] Gautier T, Berges T, Tollervey D, Hurt E; Medline: 8038777 "Nucleolar KKE/D repeat
proteins Nop56p and Nop58p interact with Nop1p and are required for ribosome biogenesis."
Mol Cell Biol 1997;17:7088-7098.

[3] Weidenhammer EM, Singh M, Ruiz-Noriega M, Woolford JL Jr; Medline: 96184869
"The PRP31 gene encodes a novel protein required for pre-mRNA splicing in Saccharomyces
cerevisiae." Nucleic Acids Res 1996;24:1164-1170.

861. (Nramp) 

Natural resistance-associated macrophage protein 


The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1,
Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of
functionally related proteins defined by a conserved hydrophobic core of ten transmembrane
domains [5]. The members of this family of membrane proteins are divalent cation
transporters. Nramp1 is an integral membrane protein expressed exclusively in cells of the
immune system and is recruited to the membrane of a phagosome upon phagocytosis [1]. By
controlling divalent cation concentrations, Nramp1 may regulate the intraphagosomal
replication of bacteria [1]. Mutations in Nramp1 may genetically predispose an individual to
diseases including leprosy and tuberculosis; conversely, they might provide protection from
rheumatoid arthritis [1]. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and
Zn2+, amongst others. It is expressed at high levels in the intestine and is a major
transferrin-independent iron uptake system in mammals [1]. The yeast proteins Smf1 and Smf2
may also transport divalent cations [3].

Number of members: 36 

[1] Govoni G, Gros P; Medline: 98383996 "Macrophage NRAMP1 and its role in resistance
to microbial infections." Inflamm Res 1998;47:277-284.

[2] Agranoff DD, Krishna S; Medline: 98294035 "Metal ion homeostasis and intracellular
parasitism." Mol Microbiol 1998;28:403-412.

[3] Pinner E, Gruenheid S, Raymond M, Gros P; Medline: 98030569 "Functional
complementation of the yeast divalent cation transporter family SMF by NRAMP2, a
member of the mammalian natural resistance-associated macrophage protein family." J Biol
Chem 1997;272:28933-28938.

[4] Cellier M, Belouchi A, Gros P; Medline: 96402487 "Resistance to intracellular infections:
comparative genomic analysis of Nramp." Trends Genet 1996;12:201-204.

[5] Cellier M, Prive G, Belouchi A, Kwan T, Rodrigues V, Chia W, Gros P; Medline:
96036029 "Nramp defines a family of membrane proteins." Proc Natl Acad Sci U S A
1995;92:10089-10093.

862. (NTP_transf_2) 
Nucleotidyltransferase domain

Members of this family belong to a large family of nucleotidyltransferases [1].

Number of members: 83

[1] Holm L, Sander C; Medline: 96005605 "DNA polymerase beta belongs to an ancient 
nucleotidyltransferase superfamily." Trends Biochem Sci 1995;20:345-347. 


863. (Paramyxo_P) 
Paramyxovirus P phosphoprotein 

This family consists of paramyxovirus P phosphoprotein from Sendai virus and human and
bovine parainfluenza viruses. The P protein is an essential part of the viral RNA polymerase
complex formed from the P and L proteins [1]. The exact role of the P protein in this complex
is unknown, but it is involved in multiple protein-protein interactions and in binding the
polymerase complex to the nucleocapsid or ribonucleoprotein template [1]. It also appears to
be important for the proper folding of the L protein [1]. The paramyxoviruses have a
negative sense ssRNA genome [1].

Number of members: 15 

[1] Bowman MC, Smallwood S, Moyer SA; Medline: 99329169 "Dissection of Individual
Functions of the Sendai Virus Phosphoprotein in Transcription." J Virol 1999;73:6474-6483.

[2] Matsuoka Y, Curran J, Pelet T, Kolakofsky D, Ray R, Compans RW; Medline: 91237868
"The P gene of human parainfluenza virus type 1 encodes P and C proteins but not a
cysteine-rich V protein." J Virol 1991;65:3406-3410.


864. (Patatin) 

This family consists of various patatin glycoproteins from plants. The patatin protein
accounts for up to 40% of the total soluble protein in potato tubers [2]. Patatin is a storage
protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalysing the cleavage
of fatty acids from membrane lipids [2].

Number of members: 21

[1] Banfalvi Z, Kostyal Z, Barta E; Medline: 95107249 "Solanum brevidens possesses a
non-sucrose-inducible patatin gene." Mol Gen Genet 1994;245:517-522.

[2] Mignery GA, Pikaard CS, Park WD; Medline: 88226014 "Molecular characterization of
the patatin multigene family of potato." Gene 1988;62:27-44.

865. (Pentapeptide_2) 
Pentapeptide repeats (8 copies)

These repeats are found in many mycobacterial proteins. They are most common in
the PPE family of proteins, where they are found in the MPTR subfamily.
The function of these repeats is unknown. The repeat can be approximately described as
XNXGX, where X can be any amino acid. These repeats are similar to Pentapeptide [1];
however, it is not clear whether the two families are structurally related.
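
The approximate XNXGX description lends itself to a simple tandem-repeat scan. The sketch below is illustrative only: the function name, the `min_copies` threshold and the test sequence are assumptions, and a real family assignment would rely on a profile rather than on this loose motif.

```python
import re

# One XNXGX unit: any residue, Asn, any residue, Gly, any residue.
UNIT = r"[A-Z]N[A-Z]G[A-Z]"


def longest_repeat_run(seq, min_copies=2):
    """Return (1-based start, number of copies) of the longest tandem XNXGX run, or None."""
    # A lookahead lets a run be considered from every starting offset.
    pattern = re.compile(r"(?=((?:%s){%d,}))" % (UNIT, min_copies))
    best = None
    for m in pattern.finditer(seq.upper()):
        copies = len(m.group(1)) // 5  # each unit is five residues long
        if best is None or copies > best[1]:
            best = (m.start(1) + 1, copies)
    return best


if __name__ == "__main__":
    # Hypothetical sequence carrying three consecutive XNXGX units.
    print(longest_repeat_run("MKANAGSANAGSANAGSPLW"))
```

Raising `min_copies` (for example to 8, the copy number given in the family title) makes the scan correspondingly stricter.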

Number of members: 362 


[1] Bateman A, Murzin A, Teichmann SA; Medline: 98318059 "Structure and distribution of
pentapeptide repeats in bacteria." Protein Sci 1998;7:1477-1480.

[2] Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, Gordon SV, Eiglmeier K,
Gas S, Barry CE 3rd, Tekaia F, Badcock K, Basham D, Brown D, Chillingworth T, Connor
R, Davies R, Devlin K, Feltwell T, Gentles S, Hamlin N, Holroyd S, Hornsby T, Jagels K,
Barrell BG; Medline: 98295987 "Deciphering the biology of Mycobacterium tuberculosis
from the complete genome sequence." Nature 1998;393:537-544.

866. (Peptidase_C13)
Peptidase C13 family

This family of peptidases is known as the hemoglobinase family because it contains a
globin-degrading enzyme from blood parasites (Swiss:P42665). However, relatives with other
functions are found in plants and other organisms. Members of this family are asparaginyl
peptidases [1].

Number of members: 26 

[1] Chen JM, Dando PM, Rawlings ND, Brown MA, Young NE, Stevens RA, Hewitt E, 
Watts C, Barrett AJ; Medline: 97218252 "Cloning, isolation, and characterization of 
mammalian legumain, an asparaginyl endopeptidase." J Biol Chem 1997;272:8090-8098. 

867. (Pro_dh) 
Proline dehydrogenase 

Number of members: 25 

[1] Ling M, Allen SW, Wood JM; Medline: 95055736 "Sequence analysis identifies the 
proline dehydrogenase and delta 1-pyrroline-5-carboxylate dehydrogenase domains of the
multifunctional Escherichia coli PutA protein." J Mol Biol 1994;243:950-956. 

868. (PsbP) 

This family consists of the 23 kDa subunit of the oxygen evolving system of photosystem II, or
PsbP, from various plants (where it is encoded by the nuclear genome) and cyanobacteria.
The 23 kDa PsbP protein is required for PSII to be fully operational in vivo; it increases the
affinity of the water oxidation site for Cl- and provides the conditions required for high
affinity binding of Ca2+ [2].



Number of members: 25

[1] Rova EM, Mc Ewen B, Fredriksson PO, Styring S; Medline: 97067138 "Photoactivation
and photoinhibition are competing in a mutant of Chlamydomonas reinhardtii lacking the
23-kDa extrinsic subunit of photosystem II." J Biol Chem 1996;271:28918-28924.
[2] Kochhar A, Khurana JP, Tyagi AK; Medline: 97191538 "Nucleotide sequence of the 
psbP gene encoding precursor of 23-kDa polypeptide of oxygen-evolving complex in 
Arabidopsis thaliana and its expression in the wild-type and a constitutively 
photomorphogenic mutant." DNA Res 1996;3:277-285. 

869. (PUA) 

The PUA domain, named after PseudoUridine synthase and Archaeosine transglycosylase,
was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine
synthases, a family of predicted ATPases that may be involved in RNA modification, and a
family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain
was detected in a family of eukaryotic proteins that also contain a domain homologous to the
translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of
translation factor. Unexpectedly, the PUA domain was also detected in bacterial and yeast
glutamate kinases; this is compatible with the demonstrated role of these enzymes in the
regulation of the expression of other genes [1]. It is predicted that the PUA domain is an
RNA binding domain.

Number of members: 48 

[1] Aravind L, Koonin EV; Medline: 99193178 "Novel predicted RNA-binding domains 
associated with the translation machinery." J Mol Evol 1999;48:291-302. 

870. (RF1)
eRF1-like proteins

Members of this family are peptide chain release factors. The eukaryotic Release Factor 1
proteins (eRF1s) are involved in termination of translation. The eRF1 protein is functional for
all stop codons and appears to abolish read-through of these codons. This family also
includes other proteins for which the precise molecular function is unknown. Many of them
are from Archaebacteria. These proteins may also be involved in translation termination, but
this awaits experimental verification.

Number of members: 25

[1] Frolova L, Le Goff X, Rasmussen HH, Cheperegin S, Drugeon G, Kress M, Arman I, 
Haenni AL, Celis JE, Philippe M, et al; Medline: 95082951 "A highly conserved eukaryotic 
protein family possessing properties of polypeptide chain release factor" [see comments] 
Nature 1994;372:701-703. 

[2] Drugeon G, Jean-Jean O, Frolova L, Le Goff X, Philippe M, Kisselev L, Haenni AL; 
Medline: 97315314 "Eukaryotic release factor 1 (eRF1) abolishes readthrough and competes
with suppressor tRNAs at all three termination codons in messenger RNA." Nucleic Acids 
Res 1997;25:2254-2258. 

871. (Ribosomal_L14e)
Ribosomal protein L14

This family includes the eukaryotic ribosomal protein L14. 
Number of members: 15 

872. (Ribosomal_S27) 
Ribosomal protein S27a 

This family of ribosomal proteins consists mainly of the 40S ribosomal protein S27a, which is
synthesized as a C-terminal extension of ubiquitin (CEP). The S27a domain comprises the
C-terminal half of the protein. The synthesis of ribosomal proteins as extensions of ubiquitin
promotes their incorporation into nascent ribosomes by a transient metabolic stabilization and
is required for efficient ribosome biogenesis [3]. The ribosomal extension protein S27a
contains a basic region that is proposed to form a zinc finger; its fusion gene is proposed as a
mechanism to maintain a fixed ratio between ubiquitin, necessary for degrading proteins, and
ribosomes, a source of proteins [2].

Number of members: 36

873. (Spermine_synth) 
Spermine/spermidine synthase 

Spermine and spermidine are polyamines. This family includes spermidine synthase that 
catalyses the fifth (last) step in the biosynthesis of spermidine from arginine, and spermine 
synthase. 

Number of members: 39 

[1] Mezquita J, Pau M, Mezquita C; Medline: 97449308 "Characterization and expression of 
two chicken cDNAs encoding ubiquitin fused to ribosomal proteins of 52 and 80 amino 
acids." Gene 1997;195:313-319. 

[2] Redman KL, Rechsteiner M; Medline: 89181932 "Identification of the long ubiquitin 
extension as ribosomal protein S27a." Nature 1989;338:438-440. 

[3] Finley D, Bartel B, Varshavsky A; Medline: 89181925 "The tails of ubiquitin precursors 
are ribosomal proteins whose fusion to ubiquitin facilitates ribosome biogenesis." Nature 
1989;338:394-401. 

874. (Surp) 
Surp module 

[1] Denhez F, Lafyatis R; Medline: 94266805 "Conservation of regulated alternative splicing 
and identification of functional domains in vertebrate homologs to the Drosophila splicing 
regulator, suppressor-of-white-apricot." J Biol Chem 1994;269:16170-16179. 

This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White- 
APricot. It has been suggested that these domains may be RNA binding [1]. 



Number of members: 32

875. (TFIIE)
TFIIE alpha subunit 

The general transcription factor TFIIE has an essential role in eukaryotic transcription 
initiation together with RNA polymerase II and other general factors. Human TFIIE consists 
of two subunits TFIIE-alpha Swiss:P29083 and TFIIE-beta Swiss:P29084 and joins the 
preinitiation complex after RNA polymerase II and TFIIF [1]. This family consists of the 
conserved amino terminal region of eukaryotic TFIIE-alpha [2] and of proteins from
archaebacteria that are also presumed to be TFIIE-alpha subunits (Swiss:O29501) [3].

Number of members: 12 

[1] Ohkuma Y, Sumimoto H, Hoffmann A, Shimasaki S, Horikoshi M, Roeder RG; Medline: 
92065982 "Structural motifs and potential sigma homologies in the large subunit of human 
general transcription factor TFIIE." Nature 1991;354:398-401. 

[2] Ohkuma Y, Hashimoto S, Roeder RG, Horikoshi M; Medline: 93087200 "Identification of
two large subdomains in TFIIE-alpha on the basis of homology between Xenopus and human
sequences." Nucleic Acids Res 1992;20:5838-5838.

[3] Klenk HP, Clayton RA, Tomb JF, White O, Nelson KE, Ketchum KA, Dodson RJ, Gwinn 
M, Hickey EK, Peterson JD, Richardson DL, Kerlavage AR, Graham DE, Kyrpides NC, 
Fleischmann RD, Quackenbush J, Lee NH, Sutton GG, Gill S, Kirkness EF, Dougherty BA, 
McKenney K, Adams MD, Loftus B, Venter JC, et al; Medline: 98049343 "The complete 
genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus
fulgidus." Nature 1997;390:364-370. 

876. (Transglut_core) 

Cross-reference(s) PS00547; TRANSGLUTAMINASES

Transglutaminases (EC 2.3.2.13) (TGase) [1,2] are calcium-dependent enzymes that catalyze
the cross-linking of proteins by promoting the formation of isopeptide bonds between the 
gamma-carboxyl group of a glutamine in one polypeptide chain and the epsilon-amino group 
of a lysine in a second polypeptide chain. TGases also catalyze the conjugation of polyamines 
to proteins. The best known transglutaminase is blood coagulation factor XIII, a plasma 
tetrameric protein composed of two catalytic A subunits and two non-catalytic B subunits. 
Factor XIII is responsible for cross-linking fibrin chains, thus stabilizing the fibrin clot. Other 
forms of transglutaminases are widely distributed in various organs, tissues and body fluids. 
Sequence data is available for the following forms of TGase: 

- Transglutaminase K (TGase K), a membrane-bound enzyme found in mammalian epidermis
and important for the formation of the cornified cell envelope (gene TGM1).

- Tissue transglutaminase (TGase C), a monomeric ubiquitous enzyme located in the 
cytoplasm (gene TGM2). 

- Transglutaminase 3, responsible for the later stages of cell envelope formation in the 
epidermis and the hair follicle (gene TGM3). 

- Transglutaminase 4 (gene TGM4). 

A conserved cysteine is known to be involved in the catalytic mechanism of TGases. The
erythrocyte membrane band 4.2 protein, which probably plays an important role in regulating
the shape of erythrocytes and their mechanical properties, is evolutionarily related to TGases.
However, the active site cysteine is substituted by an alanine and the 4.2 protein does not
show TGase activity.

Consensus pattern: [GT]-Q-[CA]-W-V-x-[SA]-[GA]-[IVT]-x(2)-T-x-[LMSC]-R-[CSA]-[LV]-G
[The first C is the active site residue]

Sequences known to belong to this class detected by the pattern: ALL. Other sequence(s)
detected in SWISS-PROT: NONE.
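
Consensus patterns written in this notation translate mechanically into regular expressions. The following sketch shows one way this could be done; it is an assumption-laden illustration rather than a reference implementation. The function names and the two test peptides are hypothetical, and only the notation elements used in this document are supported (single residues, [..] classes, {..} exclusions, x, and (n) or (n,m) repeat counts).

```python
import re

ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a hyphen-separated consensus pattern into a Python regex string."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError("unsupported pattern element: %r" % element)
        residue, low, high = m.groups()
        if residue == "x":
            residue = "."                          # any residue
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"   # any residue except those listed
        if low is not None:
            residue += "{%s}" % (low if high is None else low + "," + high)
        parts.append(residue)
    return "".join(parts)


TGASE = "[GT]-Q-[CA]-W-V-x-[SA]-[GA]-[IVT]-x(2)-T-x-[LMSC]-R-[CSA]-[LV]-G"
TGASE_RE = re.compile(prosite_to_regex(TGASE))


def active_site(seq):
    """Return (1-based position, residue) at the [CA] position of the first match, or None."""
    m = TGASE_RE.search(seq)
    if m is None:
        return None
    return m.start() + 3, seq[m.start() + 2]


if __name__ == "__main__":
    print(prosite_to_regex(TGASE))
    print(active_site("AAGQCWVFAGVLNTVLRCLGKK"))  # hypothetical peptide, Cys at the active site
    print(active_site("AAGQAWVFAGVLNTVLRCLGKK"))  # hypothetical peptide, Ala substitution
```

Because the pattern allows [CA] at the third position, a match alone does not imply catalytic activity; reporting the residue found there distinguishes a cysteine-containing TGase from an alanine-substituted relative such as band 4.2.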

[ 1] Ichinose A., Bottenus R.E., Davie E.W. J. Biol. Chem. 265:13411-13414(1990).
[ 2] Greenberg C.S., Birckbichler P.J., Rice R.H. FASEB J. 5:3071-3077(1991).

877. (TruB_N)

TruB family pseudouridylate synthase (N terminal domain) 

Members of this family are involved in modifying bases in RNA molecules. They carry out 
the conversion of uracil bases to pseudouridine. This family includes TruB, a pseudouridylate 
synthase that specifically converts uracil 55 to pseudouridine in most tRNAs. This family 
also includes Cbf5p that modifies rRNA [2].

Number of members: 33 

[1] Nurse K, Wrzesinski J, Bakin A, Lane BG, Ofengand J; Medline: 96079944 "Purification, 
cloning, and properties of the tRNA psi 55 synthase from Escherichia coli." RNA 
1995;1:102-112. 

[2] Lafontaine DL, Bousquet-Antonelli C, Henry Y, Caizergues-Ferrer M, Tollervey D;
Medline: 98139521 "The box H + ACA snoRNAs carry Cbf5p, the putative rRNA 
pseudouridine synthase." Genes Dev 1998;12:527-537. 

878. (UDPGP) 

UTP--glucose-1-phosphate uridylyltransferase

This family consists of UTP--glucose-1-phosphate uridylyltransferases, EC:2.7.7.9, also
known as UDP-glucose pyrophosphorylase (UDPGP) and glucose-1-phosphate
uridylyltransferase. UTP--glucose-1-phosphate uridylyltransferase catalyses the
interconversion of MgUTP + glucose-1-phosphate and UDP-glucose + MgPPi [1].
UDP-glucose is an important intermediate in mammalian carbohydrate interconversion, involved in
various metabolic roles depending on tissue type [1]. In Dictyostelium (slime mold), mutants
in this enzyme abort the development cycle [2]. Also within the family are
UDP-N-acetylglucosamine pyrophosphorylase Swiss:Q16222 or AGX1 [3] and two hypothetical proteins
from Borrelia burgdorferi, the Lyme disease spirochaete, Swiss:O51893 and Swiss:O51036.



Number of members: 18

[1] Duggleby RG, Chao YC, Huang JG, Peng HL, Chang HY; Medline: 96202932 "Sequence
differences between human muscle and liver cDNAs for UDPglucose pyrophosphorylase and 
kinetic properties of the recombinant enzymes expressed in Escherichia coli." Eur J Biochem 
1996;235:173-179. 

[2] Ragheb JA, Dottin RP; Medline: 87231075 "Structure and sequence of a UDP glucose 
pyrophosphorylase gene of Dictyostelium discoideum." Nucleic Acids Res 1987;15:3891- 
3906. 

[3] Mio T, Yabe T, Arisawa M, Yamada-Okabe H; Medline: 98269105 "The eukaryotic
UDP-N-acetylglucosamine pyrophosphorylases. Gene cloning, protein expression, and
catalytic mechanism." J Biol Chem 1998;273:14392-14397.

879. (UPF0044)

Uncharacterized protein family UPF0044 signature 
Cross-reference(s) PS01301; UPF0044 

The following uncharacterized proteins have been shown [1] to be highly similar:

- Bacillus subtilis hypothetical protein yqeI.

- Escherichia coli hypothetical protein yhbY and HI1333, the corresponding Haemophilus 
influenzae protein. 

- Methanococcus jannaschii hypothetical protein MJ0652. 

These are small proteins of 10 to 15 Kd. They can be picked up in the database
by the following pattern. This pattern is located in the N-terminal part of
these proteins.

Consensus pattern: L-[ST]-x(3)-K-x(3)-[KR]-[SGA]-x-[GA]-H-x-L-x-P-[LIV]-x(2)-[LIV]-
[GA]-x(2)-G

Sequences known to belong to this class detected by the pattern: ALL. Other
sequence(s) detected in SWISS-PROT: NONE.
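
The three criteria stated for this family (pattern match, size of 10 to 15 Kd, N-terminal location) can be tested together. The sketch below rests on two labeled assumptions that are not in the text: molecular mass is approximated as 110 Da per residue, a common rule of thumb, and "N-terminal part" is interpreted as the first half of the sequence. The function name and test sequences are hypothetical.

```python
import re

# Regex transcription of the consensus pattern given above (26 residues long).
UPF0044 = re.compile(r"L[ST].{3}K.{3}[KR][SGA].[GA]H.L.P[LIV].{2}[LIV][GA].{2}G")


def looks_like_upf0044(seq, da_per_residue=110.0):
    """Check pattern match, estimated mass (10 to 15 kDa) and N-terminal placement."""
    seq = seq.upper()
    # Assumption: average residue mass of about 110 Da gives a rough size estimate.
    mass_kda = len(seq) * da_per_residue / 1000.0
    m = UPF0044.search(seq)
    return (
        m is not None
        and 10.0 <= mass_kda <= 15.0
        # Assumption: "N-terminal part" means the match ends within the first half.
        and m.end() <= len(seq) / 2
    )


if __name__ == "__main__":
    motif = "LTAAAKAAAKSAGHALAPLAALGAAG"  # synthetic instance of the pattern
    print(looks_like_upf0044("MMMMM" + motif + "E" * 70))
```

Either assumption can be tightened or relaxed without changing the pattern itself; the mass estimate in particular is only a coarse filter.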

880. (zf-A20) 
A20-like zinc finger 

A20- (an inhibitor of cell death)-like zinc fingers. The zinc finger mediates self-association
in A20. These fingers also mediate IL-1-induced NF-kappa B activation.

Number of members: 22 

[1] Heyninck K, Beyaert R; Medline: 99126071 "The cytokine-inducible zinc finger protein
A20 inhibits IL-1-induced NF-kappaB activation at the level of TRAF6." FEBS Lett
1999;442:147-150.

[2] De Valck D, Heyninck K, Van Criekinge W, Contreras R, Beyaert R, Fiers W; Medline:
96390831 "A20, an inhibitor of cell death, self-associates by its
zinc finger domain." FEBS Lett 1996;384:61-64.

[3] Song HY, Rothe M, Goeddel DV; Medline: 96270609 "The tumor necrosis factor-inducible
zinc finger protein A20 interacts with TRAF1/TRAF2 and inhibits NF-kappaB
activation." Proc Natl Acad Sci U S A 1996;93:6721-6725.

[4] Opipari AW Jr, Boguski MS, Dixit VM; Medline: 90368626 "The A20 cDNA induced by
tumor necrosis factor alpha encodes a novel type of zinc finger protein." J Biol Chem
1990;265:14705-14708.

881. (zf-PARP) 

Poly(ADP-ribose) polymerase zinc finger domain 

Cross-reference(s) PS00347; PARP_ZN_FINGER_1 PS50064; PARP_ZN_FINGER_2 

Poly(ADP-ribose) polymerase (EC 2.4.2.30) (PARP) [1,2] is a eukaryotic enzyme that 
catalyzes the covalent attachment of ADP-ribose units from NAD(+) to various nuclear 
acceptor proteins. This post-translational modification of nuclear proteins is dependent 
on DNA. It appears to be involved in the regulation of various important cellular 
processes such as differentiation, proliferation and tumor transformation as well as in the 
regulation of the molecular events involved in the recovery of the cell from DNA damage. 
Structurally, PARP, about 1000 amino acid residues long, consists of three distinct
domains: an N-terminal zinc-dependent DNA-binding domain, a central automodification
domain and a C-terminal NAD-binding domain. The DNA-binding region contains a pair of
zinc finger domains which have been shown to bind DNA in a zinc-dependent manner. The
zinc finger domains of PARP seem to bind specifically to single-stranded DNA. DNA ligase
III [3] contains, in its N-terminal section, a single copy of a zinc finger highly similar to
those of PARP.

Consensus pattern: C-[KR]-x-C-x(3)-I-x-K-x(3)-[RG]-x(16,18)-W-[FYH]-H-x(2)-C [The
three C's and the H are zinc ligands]

Sequences known to belong to this class detected by the pattern: ALL. Other sequence(s)
detected in SWISS-PROT: NONE.

Sequences known to belong to this class detected by the profile: ALL. Other sequence(s)
detected in SWISS-PROT: NONE.
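
Because the consensus pattern names its zinc ligands (three cysteines and one histidine), a pattern match can also report where those ligands fall. The sketch below transcribes the pattern into a regular expression with capture groups on the ligand positions; the function name and the synthetic test sequences are assumptions made for illustration, and the variable spacer x(16,18) is written as `.{16,18}`.

```python
import re

# Capture groups mark the four zinc ligands: C, C, H, C.
PARP_ZF = re.compile(r"(C)[KR].(C).{3}I.K.{3}[RG].{16,18}W[FYH](H).{2}(C)")


def zinc_ligand_positions(seq):
    """Return the 1-based positions of the four zinc ligands for the first match, or None."""
    m = PARP_ZF.search(seq.upper())
    if m is None:
        return None
    return [m.start(i) + 1 for i in range(1, 5)]


if __name__ == "__main__":
    # Synthetic sequence with a 17-residue spacer (hypothetical test input).
    finger = "CKAC" + "AAA" + "IAK" + "AAA" + "R" + "A" * 17 + "WFH" + "AA" + "C"
    print(zinc_ligand_positions("MS" + finger))
```

A spacer shorter than 16 or longer than 18 residues causes the match to fail, which is the behaviour the x(16,18) element of the pattern calls for.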

Note: This documentation entry is linked to both signature patterns and a profile. As the 
profile is much more sensitive than the patterns, you should use it if you have access to the 
necessary software tools to do so. 

[ 1] Althaus F.R., Richter C.R. Mol. Biol. Biochem. Biophys. 37:1-126(1987).

[ 2] de Murcia G., Menissier de Murcia J. Trends Biochem. Sci. 19:172-176(1994).

[ 3] Wei Y.-F., Robins P., Carter K., Caldecott K., Pappin D.J.C., Yu G.-L., Wang R.-P.,
Shell B.K., Nash R.A., Schar P., Barnes D.E., Haseltine W.A., Lindahl T. Mol. Cell. Biol.
15:3206-3216(1995).

882. Adenylylsulfate kinase (APS_kinase) 

Enzyme that catalyses the phosphorylation of adenylylsulfate to 3'-phosphoadenylylsulfate.
This domain contains an ATP binding P-loop motif.

Number of members: 34

[1] MacRae IJ, Rose AB, Segel IH; Medline: 99003196 "Adenosine 5'-phosphosulfate kinase
from Penicillium chrysogenum. Site-directed mutagenesis at putative phosphoryl-accepting
and ATP P-loop residues." J Biol Chem 1998;273:28583-28589.

883. DNA polymerase family B signature DNA_POLYMERASE_B (DNA_pol_B) 

Replicative DNA polymerases (EC 2.7.7.7) are the key enzymes catalyzing the
accurate replication of DNA. They require either a small RNA molecule or a protein as a
primer for the de novo synthesis of a DNA chain. On the basis of sequence similarity, a
number of DNA polymerases have been grouped [1 to 7] under the designation of DNA
polymerase family B. These are:

- Higher eukaryotes polymerases alpha. 

- Higher eukaryotes polymerases delta. 

- Yeast polymerase I/alpha (gene POL1), polymerase II/epsilon (gene POL2), polymerase
III/delta (gene POL3) and polymerase REV3.

- Escherichia coli polymerase II (gene dinA or polB).

- Archaebacterial polymerases. 

- Polymerases of viruses from the herpesviridae family. 

- Polymerases from Adenoviruses. 

- Polymerases from Baculoviruses. 

- Polymerases from Chlorella viruses. 

- Polymerases from Poxviruses. 

- Bacteriophage T4 polymerase. 

- Podoviridae bacteriophages Phi-29, M2 and PZA polymerase. 

- Tectiviridae bacteriophage PRD1 polymerase. 

- Polymerases encoded on mitochondrial linear DNA plasmids in various fungi and plants 
(Kluyveromyces lactis pGKL1 and pGKL2, Agaricus bitorquis pEM, Ascobolus immersus 
pAI2, Claviceps purpurea pCLK1, Neurospora Kalilo and Maranhar, maize S-1, etc). 

Six regions of similarity (numbered from I to VI) are found in all or a subset of the above 
polymerases. The most conserved region (I) includes a conserved tetrapeptide with two 
aspartate residues. Its function is not yet known. However, it has been suggested [3] that it 
may be involved in binding a magnesium ion. This conserved region was selected as a 
signature for this family of DNA polymerases. 

Consensus pattern [YA]-[GLIVMSTAC]-D-T-D-[SG]-[LIVMFTC]-x-[LIVMSTAC] 
Sequences known to belong to this class detected by the pattern: ALL, except for yeast 
polymerase II/epsilon, Agaricus bitorquis pEM and Sulfolobus solfataricus polymerase II. 
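
Consensus patterns of this kind can be tested against a sequence by translating them into regular expressions. The following is a minimal sketch, assuming only the notation used in this document: `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, and `(n)` or `(n,m)` for repeats. The function name `prosite_to_regex` and the peptide `toy` are hypothetical and serve illustration only; the peptide is not a real polymerase sequence.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a consensus pattern (elements joined by '-') into a regex string."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        # Separate the residue specification from an optional repeat such as (3) or (2,5).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # excluded residues
        # Bracketed classes and single letters are already valid regex syntax.
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)

# Family B signature as given in the text.
POL_B = "[YA]-[GLIVMSTAC]-D-T-D-[SG]-[LIVMFTC]-x-[LIVMSTAC]"
regex = prosite_to_regex(POL_B)

toy = "MKYGDTDSLFVA"   # hypothetical peptide for illustration only
hit = re.search(regex, toy)
print(regex)
print(hit.start() + 1, hit.group())   # 1-based start position and matched segment
```

The same helper applies unchanged to the other consensus patterns listed in this section, since they all use the same notation.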

[ 1] Jung G., Leavitt M.C., Hsieh J.-C., Ito J. Proc. Natl. Acad. Sci. U.S.A. 84:8287-8291(1987). 

[ 2] Bernad A., Zaballos A., Salas M., Blanco L. EMBO J. 6:4219-4225(1987). 

[ 3] Argos P. Nucleic Acids Res. 16:9909-9916(1988). 

[ 4] Wang T.S.-F., Wong S.W., Korn D. FASEB J. 3:14-21(1989). 

[ 5] Delarue M., Poch O., Todro N., Moras D., Argos P. Protein Eng. 3:461-467(1990). 

[ 6] Ito J., Braithwaite D.K. Nucleic Acids Res. 19:4045-4057(1991). 

[ 7] Braithwaite D.K., Ito J. Nucleic Acids Res. 21:787-802(1993). 

884. DNA polymerase family X signature - DNA_POLYMERASE_X (DNA_polymeraseX) 

DNA polymerases (EC 2.7.7.7) can be classified, on the basis of sequence similarity [1], into 
at least four different groups: A, B, C and X. DNA polymerases that belong to family X are 
listed below [2]: 

- Vertebrate polymerase beta, involved in DNA repair. 

- Yeast polymerase IV (POL4) [3], an enzyme with similar characteristics to that of the 
mammalian polymerase beta. 

- Terminal deoxynucleotidyltransferase (TdT) (EC 2.7.7.31). TdT catalyzes the elongation of 
polydeoxynucleotide chains by terminal addition. One of the functions of this enzyme is the 
addition of nucleotides at the junction of rearranged Ig heavy chain and T cell receptor gene 
segments during the maturation of B and T cells. 

- African Swine Fever virus protein O174L [4]. 

- Fission yeast hypothetical protein SpAC2F7.06c. 

These enzymes are small (about 40 Kd) compared with other polymerases and their reaction 
mechanism operates via a distributive mode, i.e. they dissociate from the template-primer 
after addition of each nucleotide. 

As a signature pattern for this family of DNA polymerases, a highly conserved region that 
contains a conserved arginine and two conserved aspartic acid residues was selected. The 
latter together with the arginine have been shown [5] to be involved in primer binding in 
polymerase beta. 

Consensus pattern G-[SG]-[LFY]-x-R-[GE]-x(3)-[SGCL]-x-D-[LIVM]-D-[LIVMFY](3)-x(2)-[SAP] 
Sequences known to belong to this class detected by the pattern: ALL. 

[ 1] Ito J., Braithwaite D.K. Nucleic Acids Res. 19:4045-4057(1991). 

[ 2] Matsukage A., Nishikawa K., Ooi T., Seto Y., Yamaguchi M. J. Biol. Chem. 262:8960-8962(1987). 

[ 3] Prasad R., Widen S.G., Singhal R.K., Watkins J., Prakash L., Wilson S.H. Nucleic Acids 
Res. 21:5301-5307(1993). 

[ 4] Yanez R.J., Rodriguez J.M., Nogal M.L., Yuste L., Enriquez C., Rodriguez J.F., Vinuela 
E. Virology 208:249-278(1995). 

[ 5] Date T., Yamamoto S., Tanihara K., Nishimoto Y., Matsukage A. Biochemistry 30:5286- 
5292(1991). 

885. DUF14 - Domain of unknown function 

This domain is found in glutamate synthase, tungsten formylmethanofuran dehydrogenase 
subunit c (FwdC) and molybdenum formylmethanofuran dehydrogenase subunit c (FmdC). 
It has no known function. Number of members: 52 

[1] Hochheimer A, Hedderich R, Thauer RK; Medline: 99035764. "The formylmethanofuran 
dehydrogenase isoenzymes in Methanobacterium wolfei and Methanobacterium 
thermoautotrophicum: induction of the molybdenum isoenzyme by molybdate and 
constitutive synthesis of the tungsten isoenzyme." Arch Microbiol 1998;170:389-393. 

886. DUF15 - Domain of unknown function 

This domain of unknown function is found in several C. elegans proteins. The domain is 120 
amino acids long and rich in cysteine residues. There are 16 conserved cysteine positions in 
the domain. Number of members: 34 

887. DUF27-Domain of unknown function 

This domain is found in a number of otherwise unrelated proteins. This domain is found at 
the C-terminus of the macro-H2A histone protein Swiss:Q02874. This domain is found in 
the non-structural proteins of several types of ssRNA viruses such as NSF2 from alphaviruses 
Swiss:P03317. This domain is also found on its own in a family of proteins from bacteria 
Swiss:P75918, archaebacteria Swiss:O59182 and eukaryotes Swiss:Q17432, suggesting that 
it is involved in an important and ubiquitous cellular process. Number of members: 66 

888. DUF37-Domain of unknown function 

This domain is found in short (70 amino acid) hypothetical proteins from various bacteria. 
The domain contains three conserved cysteine residues. Swiss:Q44066 from Aeromonas 
hydrophila has been found to have hemolytic activity (unpublished). Number of members: 
19 

889. EGF-like domain signatures. (EGF-like) 

A sequence of about thirty to forty amino-acid residues found in the sequence of 
epidermal growth factor (EGF) has been shown [1 to 6] to be present, in a more or less 
conserved form, in a large number of other, mostly animal proteins. The proteins currently 
known to contain one or more copies of an EGF-like pattern are listed below. 

- Adipocyte differentiation inhibitor (gene PREF-1) from mouse (6 copies). 

- Agrin, a basal lamina protein that causes the aggregation of acetylcholine receptors on 
cultured muscle fibers (4 copies). 

- Amphiregulin, a growth factor (1 copy). 

- Betacellulin, a growth factor (1 copy). 

- Blastula proteins BP10 and Span from sea urchin which are thought to be involved in 
pattern formation (1 copy). 

- BM86, a glycoprotein antigen of cattle tick (7 copies). 

- Bone morphogenic protein 1 (BMP-1), a protein which induces cartilage and bone 
formation and which expresses metalloendopeptidase activity (1-2 copies). Homologous 
proteins are found in sea urchin - suBMP (1 copy) - and in Drosophila - the dorsal-ventral 
patterning protein tolloid (2 copies). 

- Caenorhabditis elegans developmental proteins lin-12 (13 copies) and glp-1 (10 copies). 

- Caenorhabditis elegans APX-1 protein, a patterning protein (4.5 copies). 

- Calcium-dependent serine proteinase (CASP) which degrades the extracellular matrix 
proteins type I and IV collagen and fibronectin (1 copy). 

- Cartilage matrix protein CMP (1 copy). 

- Cartilage oligomeric matrix protein COMP (4 copies). 

- Cell surface antigen 114/A10 (3 copies). 

- Cell surface glycoprotein complex transmembrane subunit ASGP-2 from rat (2 copies). 

- Coagulation associated proteins C, Z (2 copies) and S (4 copies). 

- Coagulation factors VII, IX, X and XII (2 copies). 

- Complement C1r components (1 copy). 

- Complement C1s components (1 copy). 

- Complement-activating component of Ra-reactive factor (RARF) (1 copy). 

- Complement components C6, C7, C8 alpha and beta chains, and C9 (1 copy). 

- Crumbs, an epithelial development protein from Drosophila (29 copies). 

- Epidermal growth factor precursor (7-9 copies). 

- Exogastrula-inducing peptides A, C, D and X from sea urchin (1 copy). 

- Fat protein, a Drosophila cadherin-related tumor suppressor (5 copies). 

- Fetal antigen 1, a probable neuroendocrine differentiation protein, which is derived from 
the delta-like protein (DLK) (6 copies). 

- Fibrillin 1 (47 copies) and fibrillin 2 (14 copies). 

- Fibropellins IA (21 copies), IB (13 copies), IC (8 copies), II (4 copies) and III (8 copies) 
from the apical lamina - a component of the extracellular matrix - of sea urchin. 

- Fibulin-1 and -2, two extracellular matrix proteins (9-11 copies). 

- Giant-lens protein (protein Argos), which regulates cell determination and axon guidance in 
the Drosophila eye (1 copy). 

- Growth factor-related proteins from various poxviruses (1 copy). 

- Gurken protein, a Drosophila developmental protein (1 copy). 

- Heparin-binding EGF-like growth factor (HB-EGF), transforming growth factor alpha 
(TGF-alpha), growth factors Lin-3 and Spitz (1 copy); the precursors are membrane proteins, 
the mature form is located extracellular. 

- Hepatocyte growth factor (HGF) activator (EC 3.4.21.-) (2 copies). 

- LDL and VLDL receptors, which bind and transport low-density lipoproteins and very low- 
density lipoproteins (3 copies). 

- LDL receptor-related protein (LRP), which may act as a receptor for endocytosis of 
extracellular ligands (22 copies). 

- Leucocyte antigen CD97 (3 copies), cell surface glycoprotein EMR1 (6 copies) and cell 
surface glycoprotein F4/80 (7 copies). 

- Limulus clotting factor C, which is involved in hemostasis and host defense mechanisms in 
Japanese horseshoe crab (1 copy). 

- Meprin A alpha subunit, a mammalian membrane-bound endopeptidase (1 copy). 

- Milk fat globule-EGF factor 8 (MFG-E8) from mouse (2 copies). 

- Neuregulin GGF-I and GGF-II, two human glial growth factors (1 copy). 

- Neurexins from mammals (3 copies). 

- Neurogenic proteins Notch, Xotch and the human homolog Tan-1 (36 copies), Delta (9 
copies) and the similar differentiation proteins Lag-2 from Caenorhabditis elegans (2 copies), 
Serrate (14 copies) and Slit (7 copies) from Drosophila. 

- Nidogen (also called entactin), a basement membrane protein from chordates (2-6 copies). 

- Ookinete surface proteins (24 Kd, 25 Kd, 28 Kd) from Plasmodium (4 copies). 

- Pancreatic secretory granule membrane major glycoprotein GP2 (1 copy). 

- Perforin, which lyses non-specifically a variety of target cells (1 copy). 

- Proteoglycans aggrecan (1 copy), versican (2 copies), perlecan (at least 2 copies), brevican 
(1 copy) and chondroitin sulfate proteoglycan (gene PG-M) (2 copies). 

- Prostaglandin G/H synthase 1 and 2 (EC 1.14.99.1) (1 copy), which is found in the 
endoplasmic reticulum. 

- S1-5, a human extracellular protein whose ultimate activity is probably modulated by the 
environment (5 copies). 

- Schwannoma-derived growth factor (SDGF), an autocrine growth factor as well as a 
mitogen for different target cells (1 copy). 

- Selectins. Cell adhesion proteins such as ELAM-1 (E-selectin), GMP-140 (P-selectin), or 
the lymph-node homing receptor (L-selectin) (1 copy). 

- Serine/threonine-protein kinase homolog (gene Pro25) from Arabidopsis thaliana, which 
may be involved in assembly or regulation of light-harvesting chlorophyll A/B protein (2 
copies). 

- Sperm-egg fusion proteins PH-30 alpha and beta from guinea pig (1 copy). 

- Stromal cell derived protein-1 (SCP-1) from mouse (6 copies). 

- TDGF-1, human teratocarcinoma-derived growth factor 1 (1 copy). 

- Tenascin (or neuronectin), an extracellular matrix protein from mammals (14.5 copies), 
chicken (TEN-A) (13.5 copies) and the related proteins human tenascin-X (18 copies) and 
tenascin-like proteins TEN-A and TEN-M from Drosophila (8 copies). 

- Thrombomodulin (fetomodulin), which together with thrombin activates protein C (6 
copies). 

- Thrombospondin 1, 2 (3 copies), 3 and 4 (4 copies), adhesive glycoproteins that mediate 
cell-to-cell and cell-to-matrix interactions. 

- Thyroid peroxidase 1 and 2 (EC 1.11.1.8) from human (1 copy). 

- Transforming growth factor beta-1 binding protein (TGF-B1-BP) (16 or 18 copies). 

- Tyrosine-protein kinase receptors Tek and Tie (EC 2.7.1.112) (3 copies). 

- Urokinase-type plasminogen activator (EC 3.4.21.73) (UPA) and tissue plasminogen 
activator (EC 3.4.21.68) (TPA) (1 copy). 

- Uromodulin (Tamm-Horsfall urinary glycoprotein) (THP) (3 copies). 

- Vitamin K-dependent anticoagulants protein C (2 copies) and protein S (4 copies) and the 
similar protein Z, a single-chain plasma glycoprotein of unknown function (2 copies). 

- 63 Kd sperm flagellar membrane protein from sea urchin (3 copies). 

- 93 Kd protein (gene nel) from chicken (5 copies). 

- Hypothetical 337.6 Kd protein T20G5.3 from Caenorhabditis elegans (44 copies). 

The functional significance of EGF domains in what appear to be unrelated proteins is not yet 
clear. However, a common feature is that these repeats are found in the extracellular domain 
of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin 
G/H synthase). The EGF domain includes six cysteine residues which have been shown (in 
EGF) to be involved in disulfide bonds. The main structure is a two-stranded beta-sheet 
followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the 
conserved cysteines strongly vary in length as shown in the following schematic 
representation of the EGF-like domain: 

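      +-------------------+                  +-------------------------+
      |                   |                  |                         |
 x(4)-C-x(0,48)-C-x(3,12)-C-x(1,70)-C-x(1,6)-C-x(2)-G-a-x(0,21)-G-x(2)-C-x
                |                   |
                +-------------------+
                                    ************************************

'C': conserved cysteine involved in a disulfide bond. 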

'G': often conserved glycine 

'a': often conserved aromatic amino acid 

'*': position of both patterns. 

'x': any residue 

The region between the 5th and 6th cysteine contains two conserved glycines of which at 
least one is present in most EGF-like domains. Two patterns were created for this domain, 
each including one of these C-terminal conserved glycine residues. 

Consensus pattern: C-x-C-x(5)-G-x(2)-C [The three Cs are involved in disulfide bonds] 
Sequences known to belong to this class detected by the pattern: A majority, but not those that 
have very long or very short regions between the last 3 conserved cysteines of their EGF-like 
domain(s). Other sequence(s) detected in SWISS-PROT: 87 proteins, of which 27 can be 
considered as possible candidates. 

Consensus pattern: C-x-C-x(2)-[GP]-[FYW]-x(4,8)-C [The three Cs are involved in disulfide 
bonds] Sequences known to belong to this class detected by the pattern: A majority, but not 
those that have very long or very short regions between the last 3 conserved cysteines of their 
EGF-like domain(s). Other sequence(s) detected in SWISS-PROT: 83 proteins, of which 49 
can be considered as possible candidates. Note: The beta chain of the integrin family of 
proteins contains 2 cysteine-rich repeats which were said to be dissimilar with the EGF 
pattern [7]. 
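
As an illustration of how the two EGF-like patterns can be applied, the sketch below scans a sequence for every (possibly overlapping) occurrence of each pattern. It assumes the patterns have been written by hand as regular expressions; the function name `scan` and the 16-residue fragment `toy` are hypothetical and are chosen only so that both patterns match at the same position.

```python
import re

# The two EGF-like consensus patterns from the text, rewritten as regular expressions.
EGF_PATTERNS = {
    "pattern 1": r"C.C.{5}G.{2}C",            # C-x-C-x(5)-G-x(2)-C
    "pattern 2": r"C.C.{2}[GP][FYW].{4,8}C",  # C-x-C-x(2)-[GP]-[FYW]-x(4,8)-C
}

def scan(sequence: str) -> dict[str, list[tuple[int, str]]]:
    """Return, per pattern, the 1-based start and matched segment of every hit."""
    hits = {}
    for name, regex in EGF_PATTERNS.items():
        # A lookahead with a capturing group also reports overlapping matches.
        lookahead = re.compile(f"(?=({regex}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]
    return hits

toy = "YACNCVVGYIGERCQY"   # illustrative fragment, an assumption for this example
result = scan(toy)
for name, found in result.items():
    print(name, found)
```

Because each pattern covers only the last three conserved cysteines, a hit indicates a candidate EGF-like region and not a complete six-cysteine domain.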

Note: Laminin EGF-like repeats (see <PDOC00961>) are longer than the average EGF 
module and contain a further disulfide bond C-terminal of the EGF-like region. Perlecan and 
agrin contain both EGF-like domains and laminin-type EGF-like domains. Note: The patterns 
do not detect all of the repeats of proteins with multiple EGF-like repeats. Note: See 
<PDOC00913> for an entry describing specifically the subset of EGF-like domains that bind 
calcium. 

[ 1] Davis C.G. New Biol. 2:410-419(1990). 

[ 2] Blomquist M.C., Hunt L.T., Barker W.C. Proc. Natl. Acad. Sci. U.S.A. 81:7363- 
7367(1984). 

[ 3] Barker W.C., Johnson G.C., Hunt L.T., George D.G. Protein Nucl. Acid Enz. 29:54-68(1986). 

[ 4] Doolittle R.F., Feng D.F., Johnson M.S. Nature 307:558-560(1984). 

[ 5] Appella E., Weber I.T., Blasi F. FEBS Lett. 231:1-4(1988). 

[ 6] Campbell I.D., Bork P. Curr. Opin. Struct. Biol. 3:385-392(1993). 

[ 7] Tamkun J.W., DeSimone D.W., Fonda D., Patel R.S., Buck C., Horwitz A.F., Hynes 
R.O. Cell 46:271-282(1986). 

890. Ham1 family (Ham1p_like) 

This family consists of the HAM1 protein Swiss:P47119 and hypothetical archaeal, bacterial 
and C. elegans proteins. HAM1 controls 6-N-hydroxylaminopurine (HAP) sensitivity and 
mutagenesis in S. cerevisiae Swiss:P47119 [1]. The HAM1 protein protects the cell from 
HAP, either on the level of deoxynucleoside triphosphate or the DNA level by a yet 
unidentified set of reactions [1]. Number of members: 19 

[1] Noskov VN, Staak K, Shcherbakova PV, Kozmin SG, Negishi K, Ono BC, Hayatsu H, 
Pavlov YI; Medline: 96381244 "HAM1, the gene controlling 6-N-hydroxylaminopurine 
sensitivity and mutagenesis in the yeast Saccharomyces cerevisiae." Yeast 1996;12:17-29. 

891. (HCO3_cotransp) 

Anion exchange is a cellular transport function which contributes to the regulation of cell pH 
and volume. Anion exchangers are a family of functionally related proteins that contribute to 
these properties by maintaining the intracellular level of the two principal anions: chloride 
and HCO3-. The best characterized anion exchanger is the band 3 protein [1], which is an 
erythrocyte anion exchange membrane glycoprotein. Band 3 is a protein of about 900 amino 
acids which consists of a cytoplasmic N-terminal domain of about 400 residues and a 
hydrophobic C-terminal section of about 500 residues that contains at least ten 
transmembrane regions. The cytoplasmic domain provides binding sites for cytoskeletal 
proteins, while the integral membrane domain is responsible for anion transport. Band 3 
protein is specific to erythroid cells, but at least two other proteins [2], structurally and 
functionally related to band 3, are found in nonerythroid tissues: 

- AE2 (or B3 related protein; B3RP), a protein of 1200 residues, which seems to be present 
in a variety of cell types including lymphoid, kidney, and choroid plexus. 

- AE3, a protein of 1200 residues, which is specific to neurons. 

Structurally AE2 and AE3 are very similar to band 3, the main difference being an extension 
of some 300 residues of the N-terminal domain in AE2 and AE3. 

Two signature patterns were developed for these proteins. The first pattern is based on a 
conserved stretch of sequence that contains four clustered positively charged residues and 
which is located at the C-terminal extremity of the cytoplasmic domain, just before the first 
transmembrane segment from the integral domain. The second pattern is based on the 
perfectly conserved sequence of the fifth transmembrane segment; this segment contains a 
lysine, which is the covalent binding site for the isothiocyanate group of DIDS, an inhibitor 
of anion exchange. 

Consensus pattern F-G-G-[LIVM](2)-[KR]-D-[LIVM]-[RK]-R-R-Y Sequences known to 
belong to this class detected by the pattern ALL. 

Consensus pattern [FI]-L-I-S-L-I-F-I-Y-E-T-F-x-K-L Sequences known to belong to this 
class detected by the pattern ALL. 

[ 1] Jay D., Cantley L. Annu. Rev. Biochem. 55:511-538(1986). 
[ 2] Reithmeier R.A.F. Curr. Opin. Struct. Biol. 3:515-523(1993). 



892. ATP phosphoribosyltransferase signature (HisG) 

ATP phosphoribosyltransferase (EC 2.4.2.17) is the enzyme that catalyzes the first step in the 
biosynthesis of histidine in bacteria, fungi and plants. It is a protein of about 23 to 32 Kd. As 
a signature pattern a region located in the C-terminal part of this enzyme was selected. 

Consensus pattern E-x(5)-G-x-[SAG]-x(2)-[IV]-x-D-[LIV]-x(2)-[ST]-G-x-T-[LM] 
Sequences known to belong to this class detected by the pattern ALL. 



893. HNH endonuclease (HNH) 
Number of members: 56 

[1] Shub DA, Goodrich-Blair H, Eddy SR; Medline: 95117127 "Amino acid sequence motif 
of group I intron endonucleases is conserved in open reading frames of group II introns." 
Trends Biochem Sci 1994;19:402-404. 

[2] Dalgaard JZ, Klar AJ, Moser MJ, Holley WR, Chatterjee A, Mian IS; Medline: 98026854 
"Statistical modeling and analysis of the LAGLIDADG family of site-specific endonucleases 
and identification of an intein that encodes a site-specific endonuclease of the HNH family." 
Nucleic Acids Res 1997;25:4626-4638. 

[3] Gorbalenya AE; Medline: 95004046 "Self-splicing group I and group II introns encode 
homologous (putative) DNA endonucleases of a new family." Protein Sci 1994;3:1117-1120. 

894. NEUROHYPOPHYS_HORM (hormone5) 

Oxytocin (or ocytocin) and vasopressin [1] are small (nine amino acid residues), structurally 
and functionally related neurohypophysial peptide hormones. Oxytocin causes contraction of 
the smooth muscle of the uterus and of the mammary gland while vasopressin has a direct 
antidiuretic action on the kidney and also causes vasoconstriction of the peripheral vessels. 
Like the majority of active peptides, both hormones are synthesized as larger protein 
precursors that are enzymatically converted to their mature forms. Peptides belonging to this 
family are also found in birds, fish, reptiles and amphibians (mesotocin, isotocin, valitocin, 
glumitocin, aspargtocin, vasotocin, seritocin, asvatocin, phasvatocin), in worms (annetocin), 
octopi (cephalotocin), locust (locupressin or neuropeptide F1/F2) and in molluscs 
(conopressins G and S) [2]. The pattern developed to detect this category of peptides spans 
their entire sequence and includes four invariant amino acid residues. 

Consensus pattern C-[LIFY](2)-x-N-[CS]-P-x-G [The two Cs are linked by a disulfide 
bond]. Sequences known to belong to this class detected by the pattern ALL. 
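
Because this pattern spans the whole nonapeptide, it can be checked with a full-length match. The sketch below is an illustration only; the oxytocin and arginine vasopressin sequences are the commonly cited ones and are not taken from this document, and the scrambled peptide is a hypothetical negative control.

```python
import re

# C-[LIFY](2)-x-N-[CS]-P-x-G written as a regular expression.
HORMONE = re.compile(r"C[LIFY]{2}.N[CS]P.G")

peptides = {
    "oxytocin": "CYIQNCPLG",              # commonly cited sequence (assumption)
    "arginine vasopressin": "CYFQNCPRG",  # commonly cited sequence (assumption)
    "scrambled control": "GLPCNQIYC",     # hypothetical negative control
}

# fullmatch requires the pattern to cover the entire peptide.
matches = {name: bool(HORMONE.fullmatch(seq)) for name, seq in peptides.items()}
for name, ok in matches.items():
    print(name, ok)
```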

[ 1] Acher R., Chauvet J. Biochimie 70:1197-1207(1988). 

[ 2] Chauvet J., Michel G., Ouedraogo Y., Chou J., Chait B.T., Acher R. Int. J. Pept. Protein 
Res. 45:482-487(1995). 

895. 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK) 

All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. 
Most microorganisms must synthesize folate de novo because they lack the active transport 
system of higher vertebrate cells which allows these organisms to use dietary folates. 
Enzymes involved in folate biosynthesis are therefore targets for a variety of antimicrobial 
agents such as trimethoprim or sulfonamides. 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase 
(EC 2.7.6.3) (HPPK) catalyzes the attachment of pyrophosphate to 
6-hydroxymethyl-7,8-dihydropterin to form 6-hydroxymethyl-7,8-dihydropteridine 
pyrophosphate. This is the first step in a three-step pathway leading to 7,8-dihydrofolate. 
Bacterial HPPK (gene folK or sulD) [1] is a protein of 160 to 270 amino acids. In the lower 
eukaryote Pneumocystis carinii, HPPK is the central domain of a multifunctional folate 
synthesis enzyme (gene fas) [2]. As a signature for HPPK, a conserved region located in the 
central section of these enzymes was selected. 

Consensus pattern [KRHD]-x-[GA]-[PSAE]-R-x(2)-D-[LIV]-D-[LIVM](2) Sequences 
known to belong to this class detected by the pattern: ALL. Other sequence(s) detected in 
SWISS-PROT: NONE. 

[ 1] Talarico T.L., Ray P.H., Dev I.K., Merrill B.M., Dallas W.S. J. Bacteriol. 174:5971- 
5977(1992). 

[ 2] Volpes F., Dyer M., Scaife J.G., Darby G., Stammers D.K., Delves C.J. Gene 112:213- 
218(1992). 

896. Metalloenzyme superfamily (Metalloenzyme) 

This family includes phosphopentomutase Swiss:P07651 and 2,3-bisphosphoglycerate- 
independent phosphoglycerate mutase, Swiss:P37689. This family is also related to 
alk_phosphatase [1]. The alignment contains the most conserved residues that are probably 
involved in metal binding and catalysis. Number of members: 34 

[1] Galperin MY, Bairoch A, Koonin EV; Medline: 99180418 "A superfamily of 
metalloenzymes unifies phosphopentomutase and cofactor-independent phosphoglycerate 
mutase with alkaline phosphatases and sulfatases." Protein Sci 1998;7:1829-1835. 

897. Penicillin amidase (Penicil_amidase) 

Penicillin amidase or penicillin acylase EC:3.5.1.11 catalyses the hydrolysis of 
benzylpenicillin to phenylacetic acid and 6-aminopenicillanic acid (6-APA), a key 
intermediate in the synthesis of penicillins [1]. Also in the family are cephalosporin acylase 
Swiss:P07662 and Swiss:P29958 aculeacin A acylase, which are involved in the synthesis of 
related peptide antibiotics. Number of members: 13 

[1] Verhaert RM, Riemens AM, van der Laan JM, van Duin J, Quax WJ; Medline: 97438505 
"Molecular cloning and analysis of the gene encoding the thermostable penicillin G acylase 
from Alcaligenes faecalis." Appl Environ Microbiol 1997;63:3412-3418. 
[2] Duggleby HJ, Tolley SP, Hill CP, Dodson EJ, Dodson G, Moody PC; Medline: 95115804 
"Penicillin acylase has a single-amino-acid catalytic centre." Nature 1995;373:264-268. 

898. Phosphoribosyl-AMP cyclohydrolase (PRA-CH) 

This enzyme catalyses the third step in the histidine biosynthetic pathway. It requires Zn ions 
for activity. Number of members: 13 

[1] D'Ordine RL, Klem TJ, Davisson VJ; Medline: 99129952 
"N1-(5'-phosphoribosyl)adenosine-5'-monophosphate cyclohydrolase: purification and 
characterization of a unique metalloenzyme." Biochemistry 1999;38:1537-1546. 

899. Phosphoribosyl-ATP pyrophosphohydrolase (PRA-PH) 

This enzyme catalyses the second step in the histidine biosynthetic pathway. Number of 
members: 32 

[1] Keesey JK Jr, Bigelis R, Fink GR; Medline: 79216449 "The product of the his4 gene 
cluster in Saccharomyces cerevisiae. A trifunctional polypeptide." J Biol Chem 1979 Aug 
10;254:7427-7433. 

[2] Bruni CB, Carlomagno MS, Formisano S, Paolella G; Medline: 86310274 "Primary and 
secondary structural homologies between the HIS4 gene product of Saccharomyces 
cerevisiae and the hisIE and hisD gene products of Escherichia coli and Salmonella 
typhimurium." Mol Gen Genet 1986;203:389-396. 

900. Prokaryotic membrane lipoprotein lipid attachment site (PstS) 

In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which 
is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase 
recognizes a conserved sequence and cuts upstream of a cysteine residue to which a 
glyceride-fatty acid lipid is attached [1]. Some of the proteins known to undergo such 
processing currently include (for recent listings see [1,2,3]): 

- Major outer membrane lipoprotein (murein-lipoproteins) (gene lpp). 

- Escherichia coli lipoprotein-28 (gene nlpA). 

- Escherichia coli lipoprotein-34 (gene nlpB). 

- Escherichia coli lipoprotein nlpC. 

- Escherichia coli lipoprotein nlpD. 

- Escherichia coli osmotically inducible lipoprotein B (gene osmB). 

- Escherichia coli osmotically inducible lipoprotein E (gene osmE). 

- Escherichia coli peptidoglycan-associated lipoprotein (gene pal). 

- Escherichia coli rare lipoproteins A and B (genes rplA and rplB). 

- Escherichia coli copper homeostasis protein cutF (or nlpE). 

- Escherichia coli plasmids traT proteins. 

- Escherichia coli Col plasmids lysis proteins. 

- A number of Bacillus beta-lactamases. 

- Bacillus subtilis periplasmic oligopeptide-binding protein (gene oppA). 

- Borrelia burgdorferi outer surface proteins A and B (genes ospA and ospB). 

- Borrelia hermsii variable major protein 21 (gene vmp21) and 7 (gene vmp7). 

- Chlamydia trachomatis outer membrane protein 3 (gene omp3). 

- Fibrobacter succinogenes endoglucanase cel-3. 

- Haemophilus influenzae proteins Pal and Pep. 

- Klebsiella pullulanase (gene pulA). 

- Klebsiella pullulanase secretion protein pulS. 

- Mycoplasma hyorhinis protein p37. 

- Mycoplasma hyorhinis variant surface antigens A, B, and C (genes vlpABC). 

- Neisseria outer membrane protein H.8. 

- Pseudomonas aeruginosa lipopeptide (gene lppL). 

- Pseudomonas solanacearum endoglucanase egl. 

- Rhodopseudomonas viridis reaction center cytochrome subunit (gene cytC). 

- Rickettsia 17 Kd antigen. 

- Shigella flexneri invasion plasmid proteins mxiJ and mxiM. 

- Streptococcus pneumoniae oligopeptide transport protein A (gene amiA). 

- Treponema pallidum 34 Kd antigen. 

- Treponema pallidum membrane protein A (gene tmpA). 

- Vibrio harveyi chitobiase (gene chb). 

- Yersinia virulence plasmid protein yscJ. 

- Halocyanin from Natronobacterium pharaonis [4], a membrane associated copper-binding 
protein (this is the first archaebacterial protein known to be modified in such a fashion). 

From the precursor sequences of all these proteins, a consensus pattern and a set of rules 
to identify this type of post-translational modification were derived. 

Consensus pattern {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C [C is 
the lipid attachment site] Additional rules: 1) The cysteine must be between positions 15 and 
35 of the sequence in consideration. 2) There must be at least one Lys or one Arg in the first 
seven positions of the sequence. Sequences known to belong to this class detected by the 
pattern: ALL. Other sequence(s) detected in SWISS-PROT: some 100 prokaryotic proteins. 
Some of them are not membrane lipoproteins, but at least half of them could be. 
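
The consensus pattern and the two additional rules can be combined in a single check. The following is a minimal sketch under stated assumptions: positions are counted from 1 at the first residue of the precursor, the function name `lipid_attachment_sites` is hypothetical, and the test sequences are illustrative signal-peptide-like strings rather than sequences taken from this document.

```python
import re

# {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C as a regular expression.
# The lookahead allows overlapping candidate windows to be examined.
LIPOBOX = re.compile(r"(?=([^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C))")

def lipid_attachment_sites(precursor: str) -> list[int]:
    """Return 1-based positions of cysteines that satisfy the pattern and both rules."""
    # Rule 2: at least one Lys or Arg within the first seven positions.
    if not re.search(r"[KR]", precursor[:7]):
        return []
    sites = []
    for m in LIPOBOX.finditer(precursor):
        # The pattern is 11 residues long and the cysteine is its last residue.
        cys_pos = m.start() + 11
        # Rule 1: the cysteine must lie between positions 15 and 35.
        if 15 <= cys_pos <= 35:
            sites.append(cys_pos)
    return sites

example = "MKATKLVLGAVILGSTLLAGCSSN"     # illustrative precursor-like sequence
no_basic = "MAATALVLGAVILGSTLLAGCSSN"    # same, without Lys/Arg near the N-terminus
too_early = "MKLVLGAVILLAGCSS"           # pattern present, cysteine at position 14

print(lipid_attachment_sites(example))
print(lipid_attachment_sites(no_basic))
print(lipid_attachment_sites(too_early))
```

As the text notes, a sequence that passes this check is only a candidate; some matching proteins are not membrane lipoproteins.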

[ 1] Hayashi S., Wu H.C. J. Bioenerg. Biomembr. 22:451-471(1990). 
[ 2] Klein P., Somorjai R.L., Lau P.C.K. Protein Eng. 2:15-20(1988). 
[ 3] von Heijne G. Protein Eng. 2:531-534(1989). 

[ 4] Mattar S., Scharf B., Kent S.B.H., Rodewald K., Oesterhelt D., Engelhard M. J. Biol. 
Chem. 269:14939-14945(1994). 

901. Ribosome recycling factor (RRF) 

The ribosome recycling factor (RRF / ribosome release factor) dissociates the ribosome from 
the mRNA after termination of translation, and is essential for bacterial growth [1]. Thus 
ribosomes are "recycled" and ready for another round of protein synthesis. Number of 
members: 27 

[1] Janosi L, Shimizu I, Kaji A; Medline: 94240115 "Ribosome recycling factor (ribosome 
releasing factor) is essential for bacterial growth." Proc Natl Acad Sci U S A 1994;91:4249- 
4253. 

902. S-layer homology (SLH) 

S-layers are paracrystalline mono-layered assemblies of (glyco)proteins which coat the 
surface of bacteria [1]. Several S-layer proteins and some other cell wall proteins contain one 
or more copies of a domain of about 50-60 residues, which has been called SLH (for S-layer 
homology) [2]. There is strong evidence that this domain serves as an anchor to the 
peptidoglycan [3]. The SLH domain has been found in: 

- S-layer glycoprotein of Acetogenium kivui (3 copies). 

- S-layer 125 Kd protein of Bacillus sphaericus (3 copies). 

- S-layer protein of Bacillus anthracis (3 copies). 

- S-layer protein of Bacillus licheniformis (3 copies). 

- S-layer protein (HWP) from Bacillus brevis strain HPD31 (3 copies). 

- Middle cell wall protein (MWP) from Bacillus brevis strain 47 (3 copies). 

- S-layer protein (p100) of Thermus thermophilus (1 copy). 

- Outer membrane protein Omp-alpha from Thermotoga maritima (1 copy). 

- Cellulosome anchoring protein (gene ancA), outer layer protein B (OlpB) and a further 
potential cell surface glycoprotein from Clostridium thermocellum (3 copies; the first copy is 
missing its N-terminal third which is appended to the end of the third copy; may have arisen 
by circular permutation). 

- Amylopullulanase (gene amyB) from Thermoanaerobacter thermosulfurogenes (3 copies). 

- Amylopullulanase (gene aapT) from Bacillus strain XAL-601 (3 copies). 

- Endoglucanase from Bacillus strain KSM-635 (3 copies). 

- Exoglucanase (gene xynX) from Clostridium thermocellum (3 copies). 

- Xylanase A (gene xynA) from Thermoanaerobacter saccharolyticum (2 copies; 3 copies if a 
frameshift is taken into account). 

- Protein involved in butirosin production (ButB) from Bacillus circulans (2 incomplete
copies; 3 copies if three frameshifts are taken into account).

- Two hypothetical proteins from Synechocystis strain PCC 6803 (1 copy each). 
- A hypothetical protein with sequence similarity to amylopullulanases found 3' of amylase
gene from Bacillus circulans (fragment of 1 copy; 3 copies if two frameshifts are taken into
account).

SLH domains are found at the N- or C-termini of mature proteins. They occur in single copy 
followed by a predicted coiled coil domain, or in three contiguous copies. Structurally, the 
SLH domain is predicted to contain two alpha-helices flanking a beta strand. The SLH 
sequences are fairly divergent with an average identity of about 25%. It is however possible 
to build a sequence pattern that starts at the second position of the domain and that spans 3/4 
of its length. 

Consensus pattern: [LVFYT]-x-[DA]-x(2,5)-[DNGSATPHY]-[FYWPDA]-x(4)-[LIV]-x(2)-
[GTALV]-x(4,6)-[LIVFYC]-x(2)-G-x-[PGSTA]-x(2,3)-[MFYA]-x-[PGAV]-x(3,10)-
[LIVMA]-[STKR]-[RY]-x-[EQ]-x-[STALIVM]
Sequences known to belong to this class detected by the pattern: ALL.
Other sequence(s) detected in SWISS-PROT: NONE.
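
A consensus pattern written in this notation can be translated mechanically into a regular expression. The following is a minimal sketch, assuming the pattern uses only the elements that appear in this section: single residues, x for any residue, [..] for allowed residues, {..} for excluded residues, and (n) or (n,m) repeat counts. The function name prosite_to_regex is hypothetical and is not part of any published tool.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style consensus pattern into a Python regex string.

    Hypothetical helper for illustration; it handles only the pattern
    elements listed in the accompanying text.
    """
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        if not element:
            continue
        # Separate an optional repeat count, e.g. (4) or (2,5), from the residue part.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            regex = "."                      # any residue
        elif core.startswith("{") and core.endswith("}"):
            regex = "[^" + core[1:-1] + "]"  # excluded residues
        else:
            regex = core                     # single residue or [..] set
        if low and high:
            regex += "{%s,%s}" % (low, high)
        elif low:
            regex += "{%s}" % low
        parts.append(regex)
    return "".join(parts)

# The SLH consensus pattern as given above.
SLH_PATTERN = (
    "[LVFYT]-x-[DA]-x(2,5)-[DNGSATPHY]-[FYWPDA]-x(4)-[LIV]-x(2)-"
    "[GTALV]-x(4,6)-[LIVFYC]-x(2)-G-x-[PGSTA]-x(2,3)-[MFYA]-x-[PGAV]-x(3,10)-"
    "[LIVMA]-[STKR]-[RY]-x-[EQ]-x-[STALIVM]"
)
SLH_REGEX = re.compile(prosite_to_regex(SLH_PATTERN))
```

SLH_REGEX.search(sequence) then returns a match object for a one-letter protein sequence containing the pattern, or None otherwise.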

[ 1] Beveridge T.J. Curr. Opin. Struct. Biol. 4:204-212(1994). 

[ 2] Lupas A., Engelhardt H., Peters J., Santarius U., Volker S., Baumeister W. J. Bacteriol. 
176:1224-1233(1994). 

[ 3] Lemaire M., Ohayon H., Gounon P., Fujino T., Beguin P. J. Bacteriol. 177:2451- 
2459(1995). 

903. Queuine tRNA-ribosyltransferase (TGT) 

This is a family of queuine tRNA-ribosyltransferases (EC:2.4.2.29), also known as tRNA-
guanine transglycosylase and guanine insertion enzyme. Queuine tRNA-ribosyltransferase
modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine. It catalyses
the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and
the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA,
giving a hypermodified base queuine in the wobble position [1,2]. The aligned region contains
a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7-deazaguanine
binding residues [1]. Number of members: 27
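
The zinc binding motif C-x-C-x2-C-x29-H has fixed spacing, so it can be located with a short regular expression. This is a minimal sketch, assuming x2 and x29 mean exactly 2 and 29 arbitrary residues; the function name find_zinc_motif and the test sequence are hypothetical.

```python
import re

# C-x-C-x2-C-x29-H: three cysteines and one histidine with fixed spacing.
# The lookahead reports overlapping occurrences as well.
ZINC_MOTIF = re.compile(r"(?=C.C.{2}C.{29}H)")

def find_zinc_motif(sequence):
    """Return 0-based start positions of the motif in a one-letter protein sequence."""
    return [m.start() for m in ZINC_MOTIF.finditer(sequence.upper())]

# Synthetic sequence built to contain the motif once; it is not a real TGT sequence.
synthetic = "MK" + "CACAAC" + "A" * 29 + "H" + "GG"
print(find_zinc_motif(synthetic))
```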
[1] Romier C, Reuter K, Suck D, Ficner R; Medline: 96256303 "Crystal structure of tRNA-
guanine transglycosylase: RNA modification by base exchange." EMBO J 1996;15:2850- 
2857. 

[2] Garcia GA, Koch KA, Cheng S; Medline: 93287116 "tRNA-guanine transglycosylase 
from Escherichia coli. Overexpression, purification and quaternary structure." J Mol Biol 
1993;231:489-497. 

904. ThiC Family (ThiC) 

ThiC is found within the thiamine biosynthesis operon. ThiC is involved in pyrimidine 
biosynthesis [2]. ThiC catalyzes the substitution of the pyrophosphate of 2-methyl-4-amino- 
5-hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-(beta-hydroxyethyl)thiazole 
phosphate to yield thiamine phosphate [3]. Number of members: 12 

[1] Vander Horn PB, Backstrom AD, Stewart V, Begley TP; Medline: 93163063 "Structural 
genes for thiamine biosynthetic enzymes (thiCEFGH) in Escherichia coli K-12." J Bacteriol 
1993;175:982-992. 

[2] Begley TP, Downs DM, Ealick SE, McLafferty FW, Van Loon AP, Taylor S, 
Campobasso N, Chiu HJ, Kinsland C, Reddick JJ, Xi J; Medline: 99311269 "Thiamin 
biosynthesis in prokaryotes." Arch Microbiol 1999;171:293-300. 

[3] Zhang Y, Taylor SV, Chiu HJ, Begley TP; Medline: 97284509 "Characterization of the 
Bacillus subtilis thiC operon involved in thiamine biosynthesis." J Bacteriol 1997;179:3030- 
3035. 

905. Putative tRNA binding domain (tRNA_bind) 

This domain is found in prokaryotic methionyl-tRNA synthetases, prokaryotic
phenylalanyl-tRNA synthetases, the yeast GU4 nucleic-binding protein (G4p1 or p42, ARC1) [2],
human tyrosyl-tRNA synthetase [1], and endothelial-monocyte activating polypeptide II. G4p1
binds specifically to tRNA and forms a complex with methionyl-tRNA synthetases [2]. In human
tyrosyl-tRNA synthetase this domain may direct tRNA to the active site of the enzyme [2].
This domain may perform a common function in tRNA aminoacylation [1]. Number of members: 12
[1] Kleeman TA, Wei D, Simpson KL, First EA; Medline: 97306356 "Human tyrosyl-tRNA
synthetase shares amino acid sequence homology with a putative cytokine." J Biol Chem 
1997;272:14420-14425. 

[2] Simos G, Segref A, Fasiolo F, Hellmuth K, Shevchenko A, Mann M, Hurt EC; Medline: 
97050848 "The yeast protein Arc1p binds to tRNA and functions as a cofactor for the
methionyl- and glutamyl-tRNA synthetases." EMBO J 1996;15:5437-5448.

906. UbiA prenyltransferase family signature (UbiA) 

The following prenyltransferases are evolutionarily related [1,2]:

- Bacterial 4-hydroxybenzoate octaprenyltransferase (gene ubiA). 

- Yeast mitochondrial para-hydroxybenzoate-polyprenyltransferase (gene COQ2). 

- Protoheme IX farnesyltransferase (heme O synthase) from yeast and mammals (gene
COX10) and from bacteria (genes cyoE or ctaB).

These proteins probably contain seven transmembrane segments. The best conserved region 
is located in a loop between the second and third of these segments and was used as a 
signature pattern. 

Consensus pattern: N-x(3)-[DE]-x(2)-[LIF]-D-x(2)-[VM]-x-R-[ST]-x(2)-R-x(4)-G
Sequences known to belong to this class detected by the pattern: ALL.
Other sequence(s) detected in SWISS-PROT: NONE.

[ 1] Melzer M., Heide L. Biochim. Biophys. Acta 1212:93-102(1994). 
[ 2] Mogi T., Saiki K., Anraku Y. Mol. Microbiol. 14:391-398(1994). 

907. Uncharacterized protein family UPF0044 signature (UPF0044) 

The following uncharacterized proteins have been shown [1] to be highly similar: 

- Bacillus subtilis hypothetical protein yqeI.

- Escherichia coli hypothetical protein yhbY and HI1333, the corresponding Haemophilus 
influenzae protein. 
- Methanococcus jannaschii hypothetical protein MJ0652.

These are small proteins of 10 to 15 Kd. They can be picked up in the database by the 
following pattern. This pattern is located in the N-terminal part of these proteins. 

Consensus pattern: L-[ST]-x(3)-K-x(3)-[KR]-[SGA]-x-[GA]-H-x-L-x-P-[LIV]-x(2)-[LIV]-
[GA]-x(2)-G
Sequences known to belong to this class detected by the pattern: ALL.

908. ATP synthase (C/AC39) subunit (vATP-synt_AC39) 

This family includes the AC39 subunit from vacuolar ATP synthase Swiss:P32366 [1], and 
the C subunit from archaebacterial ATP synthase [2]. The family also includes subunit C 
from the Sodium transporting ATP synthase from Enterococcus hirae Swiss:P43456 [3]. 
Number of members: 12 

[1] Bauerle C, Ho MN, Lindorfer MA, Stevens TH; Medline: 93286119 "The Saccharomyces 
cerevisiae VMA6 gene encodes the 36-kDa subunit of the vacuolar H(+)-ATPase membrane 
sector." J Biol Chem 1993;268:12749-12757. 

[2] Wilms R, Freiberg C, Wegerle E, Meier I, Mayer F, Muller V; Medline: 96324968
"Subunit structure and organization of the genes of the A1AO ATPase from the Archaeon
Methanosarcina mazei Gö1." J Biol Chem 1996;271:18843-18852.

[3] Takase K, Kakinuma S, Yamato I, Konishi K, Igarashi K, Kakinuma Y; Medline:
94209269 "Sequencing and characterization of the ntp gene cluster for vacuolar-type Na(+)-
translocating ATPase of Enterococcus hirae." J Biol Chem 1994;269:11037-11044.

909. ATP synthase (E/31 kDa) subunit (vATP-synt_E) 

This family includes the vacuolar ATP synthase E subunit [1], as well as the archaebacterial 
ATP synthase E subunit [2]. Number of members: 24 

[1] Foury F; Medline: 91009356 "The 31-kDa polypeptide is an essential subunit of the 
vacuolar ATPase in Saccharomyces cerevisiae." J Biol Chem 1990;265:18554-18560. 
[2] Wilms R, Freiberg C, Wegerle E, Meier I, Mayer F, Muller V; Medline: 96324968
"Subunit structure and organization of the genes of the A1AO ATPase from the Archaeon
Methanosarcina mazei Gö1." J Biol Chem 1996;271:18843-18852.

910. (WW) 

The WW domain [1-4,E1] (also known as rsp5 or WWP) was originally discovered as a
short conserved region in a number of unrelated proteins, among them dystrophin, the product
of the gene responsible for Duchenne muscular dystrophy. The domain, which spans about 35
residues, is repeated up to 4 times in some proteins. It has been shown [5] to bind proteins
with particular proline-motifs, [AP]-P-P-[AP]-Y, and thus somewhat resembles SH3 domains.
It appears to contain beta-strands grouped around four conserved aromatic positions, generally
Trp. The
name WW or WWP derives from the presence of these Trp as well as that of a conserved Pro. 
It is frequently associated with other domains typical for proteins in signal transduction 
processes. 

Proteins containing the WW domain are listed below. 

- Dystrophin, a multidomain cytoskeletal protein. Its longest alternatively spliced form
consists of an N-terminal actin-binding domain, followed by 24 spectrin-like repeats, a
cysteine-rich calcium-binding domain and a C-terminal globular domain. Dystrophin forms
tetramers and is thought to have multiple functions including involvement in membrane
stability, transduction of contractile forces to the extracellular environment and organization
of membrane specialization. Mutations in the dystrophin gene lead to muscular dystrophy of
Duchenne or Becker type. Dystrophin contains one WW domain C-terminal of the spectrin
repeats.

- Utrophin, a dystrophin-like protein of unknown function. 

- Vertebrate YAP protein is a substrate of an unknown serine kinase. It binds to the SH3 
domain of the Yes oncoprotein via a proline-rich region. This protein appears in alternatively 
spliced isoforms, containing either one or two WW domains [6]. 

- Mouse NEDD-4 plays a role in the embryonic development and differentiation of the 
central nervous system. It contains 3 WW modules followed by a HECT domain. The human 
ortholog contains 4 WW domains, but the third WW domain is probably spliced resulting in 
an alternate NEDD-4 protein with only 3 WW modules [3]. 
- Yeast RSP5 is similar to NEDD-4 in its molecular organization. It contains an N-terminal
C2 domain (see <PDOC00380>), followed by a histidine-rich region, 3 WW domains and a
HECT domain.

- Rat FE65, a transcription-factor activator expressed preferentially in liver. The activator 
domain is located within the N-terminal 232 residues of FE65, which also contain the WW 
domain. 

- Yeast ESS1/PTF1, a putative peptidyl prolyl cis-trans isomerase from family ppiC (see
<PDOC00840>). A related protein, dodo (gene dod) exists in Drosophila and in mammals
(gene PIN1).

- Tobacco DB10 protein. The WW domain is located N-terminal to the region with similarity
to ATP-dependent RNA helicases. 

- IQGAP, a human GTPase activating protein acting on ras. It contains an N-terminal
domain similar to fly muscle mp20 protein and a C-terminal ras GTPase activator domain. 

- Yeast pre-mRNA processing protein PRP40, Caenorhabditis elegans ZK1098.1 and fission 
yeast SpAC13C5.02 are related proteins with similarity to MYO2-type myosin, each
containing two WW-domains at the N-terminus. 

- Caenorhabditis elegans hypothetical protein C38D4.5, which contains one WW module, a 
PH domain (see <PDOC50003>) and a C-terminal phosphatidylinositol 3-kinase domain. 

- Yeast hypothetical protein YFL010c.

For the sensitive detection of WW domains, a profile was developed which spans the whole 
homology region as well as a pattern. 

Consensus pattern: W-x(9,11)-[VFY]-[FYW]-x(6,7)-[GSTNE]-[GSTQCR]-[FYW]-x(2)-P
Sequences known to belong to this class detected by the pattern: ALL.
Other sequence(s) detected in SWISS-PROT: 8.
Sequences known to belong to this class detected by the profile: ALL.
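
Because the WW domain can be repeated up to 4 times in one protein, it is often useful to list every hit of the pattern rather than only the first. The following is a minimal sketch, assuming the first gap of the pattern reads x(9,11); the function name find_ww_domains and the synthetic sequence are hypothetical. It covers only the pattern, not the more sensitive profile.

```python
import re

# W-x(9,11)-[VFY]-[FYW]-x(6,7)-[GSTNE]-[GSTQCR]-[FYW]-x(2)-P
WW_REGEX = re.compile(r"W.{9,11}[VFY][FYW].{6,7}[GSTNE][GSTQCR][FYW].{2}P")

def find_ww_domains(sequence):
    """Return (start, end) spans (0-based, end exclusive) of non-overlapping pattern hits."""
    return [m.span() for m in WW_REGEX.finditer(sequence.upper())]

# Synthetic unit built to satisfy the pattern; it is not a real WW domain.
unit = "W" + "A" * 9 + "VY" + "A" * 6 + "GGW" + "AA" + "P"
print(find_ww_domains(unit + "SSSS" + unit))
```

The number of spans returned gives a rough count of WW modules in the sequence.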

[ 1] Bork P., Sudol M. Trends Biochem. Sci. 19:531-533(1994). 

[ 2] Andre B., Springael J.Y. Biochem. Biophys. Res. Commun. 205:1201-1205(1994). 
[ 3] Hofmann K.O., Bucher P. FEBS Lett. 358:153-157(1995).

[ 4] Sudol M., Chen H.I., Bougeret C., Einbond A., Bork P. FEBS Lett. 369:67-71(1995).

[ 5] Chen H.I., Sudol M. Proc. Natl. Acad. Sci. U.S.A. 92:7819-7823(1995).
[ 6] Sudol M., Bork P., Einbond A., Kastury K., Druck T., Negrini M., Huebner K., Lehman
D. J. Biol. Chem. 270:14733-14741(1995). 

911. Xeroderma pigmentosum (XP) [1] (XPG_1) 

Xeroderma pigmentosum (XP) [1] is a human autosomal recessive disease, characterized by a
high incidence of sunlight-induced skin cancer. Skin cells of people with this condition are
hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair.
There are a minimum of seven genetic complementation groups involved in this pathway:
XP-A to XP-G. The defect in XP-G can be corrected by a 133 Kd nuclear protein called XPG
(or XPGC) [2].

XPG belongs to a family of proteins [2,3,4,5,6] that are composed of two main subsets: 

- Subset 1, to which belong XPG, RAD2 from budding yeast and rad13 from fission yeast.
RAD2 and XPG are single-stranded DNA endonucleases [7,8]. XPG makes the 3' incision in
human DNA nucleotide excision repair [9].

- Subset 2, to which belong mouse and human FEN-1, rad2 from fission yeast, and RAD27
from budding yeast. FEN-1 is a structure-specific endonuclease. 

In addition to the proteins listed in the above groups, this family also includes: 

- Fission yeast exo1, a 5'->3' double-stranded DNA exonuclease that could act in a pathway
that corrects mismatched base pairs.

- Yeast EXO1 (DHS1), a protein with probably the same function as exo1.

- Yeast DIN7. 

Sequence alignment of this family of proteins reveals that similarities are largely confined to 
two regions. The first is located at the N-terminal extremity (N-region) and corresponds to 
the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the
C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino 
acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the 
conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in 
XPG. The amino acids linking the N- and I-regions are not conserved; indeed, they are 
largely absent from proteins belonging to the second subset. 
Two signature patterns were developed for these proteins. The first corresponds to the central
part of the N-region, the second to part of the I-region and includes the putative catalytic core 
pentapeptide. 

Consensus pattern: [VI]-[KRE]-P-x-[FYIL]-V-F-D-G-x(2)-[PIL]-x-[LVC]-K
Sequences known to belong to this class detected by the pattern: ALL.
Other sequence(s) detected in SWISS-PROT: NONE.

Consensus pattern: [GS]-[LIVM]-[PER]-[FYS]-[LIVM]-x-A-P-x-E-A-[DE]-[PAS]-[QS]-[CLM]
Sequences known to belong to this class detected by the pattern: ALL.
Other sequence(s) detected in SWISS-PROT: NONE.
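
Since the family is defined by two signatures, one in the N-region and one in the I-region, a candidate sequence can be screened for both at once. The following is a minimal sketch, assuming the two consensus patterns above translate directly into regular expressions; the function name xpg_signature_hits and the synthetic fragments are hypothetical.

```python
import re

# N-region signature: [VI]-[KRE]-P-x-[FYIL]-V-F-D-G-x(2)-[PIL]-x-[LVC]-K
N_REGION = re.compile(r"[VI][KRE]P.[FYIL]VFDG.{2}[PIL].[LVC]K")
# I-region signature: [GS]-[LIVM]-[PER]-[FYS]-[LIVM]-x-A-P-x-E-A-[DE]-[PAS]-[QS]-[CLM]
I_REGION = re.compile(r"[GS][LIVM][PER][FYS][LIVM].AP.EA[DE][PAS][QS][CLM]")

def xpg_signature_hits(sequence):
    """Return the 0-based start of the first hit of each signature, or None if absent."""
    seq = sequence.upper()
    n_hit = N_REGION.search(seq)
    i_hit = I_REGION.search(seq)
    return {
        "N_region": n_hit.start() if n_hit else None,
        "I_region": i_hit.start() if i_hit else None,
    }

# Synthetic fragments built to satisfy each pattern; they are not real XPG sequences.
n_fragment = "VKPAFVFDGAAPALK"
i_fragment = "GLPFLAAPAEADAQC"
print(xpg_signature_hits("M" * 5 + n_fragment + "G" * 50 + i_fragment))
```

A sequence that carries both signatures, with the N-region hit preceding the I-region hit, is a candidate member of the family; a hit for only one signature deserves closer inspection.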

[ 1] Tanaka K., Wood R.D. Trends Biochem. Sci. 19:83-86(1994). 

[ 2] Scherly D., Nouspikel T., Corlet J., Ucla C., Bairoch A., Clarkson S.G. Nature
363:182-185(1993).

[ 3] Carr A.M., Sheldrick K.S., Murray J.M., Al-Harithy R., Watts F.Z., Lehmann A.R. 
Nucleic Acids Res. 21:1345-1349(1993). 

[ 4] Murray J.M., Tavassoli M., Al-Harithy R., Sheldrick K.S., Lehmann A.R., Carr A.M.,
Watts F.Z. Mol. Cell. Biol. 14:4878-4888(1994).

[ 5] Harrington J.J., Lieber M.R. Genes Dev. 8:1344-1355(1994). 

[ 6] Szankasi P., Smith G.R. Science 267:1166-1169(1995). 

[ 7] Habraken Y., Sung P., Prakash L., Prakash S. Nature 366:365-368(1993). 

[ 8] O'Donovan A., Scherly D., Clarkson S.G., Wood R.D. J. Biol. Chem.
269:15965-15968(1994).

[ 9] O'Donovan A., Davies A.A., Moggs J.G., West S.C., Wood R.D. Nature 371:432- 
435(1994). 

912. 5-formyltetrahydrofolate cyclo-ligase (5-FTHF_cyc-lig) 

5-formyltetrahydrofolate cyclo-ligase or methenyl-THF synthetase (EC:6.3.3.2) catalyses the
conversion of 5-formyltetrahydrofolate (5-FTHF) to 5,10-methenyltetrahydrofolate; this
requires ATP and Mg2+ [1]. 5-FTHF is used in chemotherapy, where it is clinically known as
Leucovorin [2].

Number of members: 23 

[1] Dayan A, Bertrand R, Beauchemin M, Chahla D, Mamo A, Filion M, Skup D, Massie B, 
Jolivet J; Medline: 96096540 "Cloning and characterization of the human 5,10- 
methenyltetrahydrofolate synthetase-encoding cDNA." Gene 1995;165:307-311. 
[2] Maras B, Stover P, Valiante S, Barra D, Schirch V; Medline: 94308074 "Primary 
structure and tetrahydropteroylglutamate binding site of rabbit liver cytosolic 5,10- 
methenyltetrahydrofolate synthetase." J Biol Chem 1994;269:18429-18433. 

913. Cytosolic long-chain acyl-CoA thioester hydrolase (Acyl-CoA_hydro) 

This family consists of various cytosolic long-chain acyl-CoA thioester hydrolases, including
those of human and rat [1,2]. The aligned region is repeated within the sequence of human and
rat cytosolic long-chain acyl-CoA thioester hydrolases of this family. Long-chain acyl-CoA
hydrolases hydrolyse palmitoyl-CoA to CoA and palmitate; they also catalyse the hydrolysis
of other long chain fatty acyl-CoA thioesters. Long-chain acyl-CoA hydrolases are present in
all living organisms and they may provide a mechanism for the control of lipid metabolism
[1].

Number of members: 24 

[1] Yamada J, Furihata T, Iida N, Watanabe T, Hosokawa M, Satoh T, Someya A, Nagaoka I,
Suga T; Medline: 97236308 "Molecular cloning and expression of cDNAs encoding rat brain 
and liver cytosolic long-chain acyl-CoA hydrolases." Biochem Biophys Res Commun 
1997;232:198-203. 

[2] Broustas CG, Larkins LK, Uhler MD, Hajra AK; Medline: 96209964 "Molecular cloning 
and expression of cDNA encoding rat brain cytosolic acyl-coenzyme A thioester hydrolase." 
J Biol Chem 1996;271:10470-10476. 

914. Agglutinin 

Lectin (probable mannose binding) 
Members of this family are plant lectins. Many if not all are mannose specific.
Number of members: 87 

[1] Wright CS, Hester G; Medline: 97094989 "The 2.0 A structure of a cross-linked complex 
between snowdrop lectin and a branched mannopentaose: evidence for two unique binding 
modes." Structure 1996;4:1339-1352. 

915. (ANF_RECEPTORS) 

Natriuretic peptides are hormones involved in the regulation of fluid and electrolyte 
homeostasis. These hormones stimulate the intracellular production of cyclic GMP as a 
second messenger. 

Currently, three types of natriuretic peptide receptors are known [1,2]. Two express guanylate 
cyclase activity: GC-A (or ANP-A) which seems specific to atrial natriuretic peptide (ANP), 
and GC-B (or ANP-B) which seems to be stimulated more effectively by brain natriuretic 
peptide (BNP) than by ANP. The third receptor (ANP-C) is probably responsible for the 
clearance of ANP from the circulation and does not play a role in signal transduction. 

GC-A and GC-B are plasma membrane-bound proteins that share the following topology: an 
N-terminal extracellular domain which acts as the ligand binding region, then a 
transmembrane domain followed by a large cytoplasmic C-terminal region that can be
subdivided into two domains: a protein kinase-like domain (see <PDOC00100>) that appears 
important for proper signalling and a guanylate cyclase catalytic domain (see 
<PDOC00425>). The topology of ANP-C is different: like GC-A and -B it possesses an 
extracellular ligand-binding region and a transmembrane domain, but its cytoplasmic domain 
is very short. 

A pattern was developed from the ligand-binding region of natriuretic peptide receptors based 
on a highly conserved region located in the N-terminal part of the domain. 

Consensus pattern: G-P-x-C-x-Y-x-A-A-x-V-x-R-x(3)-H-W
Sequences known to belong to this class detected by the pattern: ALL.
Other sequence(s) detected in SWISS-PROT: NONE.

[ 1] Garbers D.L. New Biol. 2:499-504(1990).

[ 2] Schulz S., Chinkers M., Garbers D.L. FASEB J. 2:2026-2035(1989). 

916. (Apocytochrome)

Cytochrome c family heme-binding site signature 

In proteins belonging to cytochrome c family [1], the heme group is covalently attached by 
thioether bonds to two conserved cysteine residues. The consensus sequence for this site is 
Cys-X-X-Cys-His and the histidine residue is one of the two axial ligands of the heme iron.
This arrangement is shared by all proteins known to belong to cytochrome c family, which
presently includes cytochromes c, c', c1 to c6, c550 to c556, cc3/Hmc, cytochrome f and
reaction center cytochrome c.

Consensus pattern: C-{CPWHF}-{CPWR}-C-H-{CFYW}
Sequences known to belong to this class detected by the pattern: ALL, except for four
cytochrome c's which lack the first thioether bond.
Other sequence(s) detected in SWISS-PROT: 454.

Note: some cytochrome c's have more than a single bound heme group: c4 has 2, c7 has 3, c3
has 4, the reaction center has 4, and cc3/Hmc has 16!
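
Because a single cytochrome c may carry several heme groups, counting the hits of the heme-binding site pattern gives a first estimate of how many hemes a sequence could bind. The following is a minimal sketch, assuming the {..} elements of the pattern denote excluded residues; the function name heme_site_positions and the synthetic sequence are hypothetical. As the 454 other SWISS-PROT hits indicate, a match on its own is weak evidence.

```python
import re

# C-{CPWHF}-{CPWR}-C-H-{CFYW}; each {..} lists residues that are NOT allowed.
# The lookahead lets overlapping candidate sites be reported as well.
HEME_SITE = re.compile(r"(?=C[^CPWHF][^CPWR]CH[^CFYW])")

def heme_site_positions(sequence):
    """Return 0-based positions of the first cysteine of each candidate site."""
    return [m.start() for m in HEME_SITE.finditer(sequence.upper())]

# Synthetic sequence with two candidate sites; it is not a real cytochrome.
print(heme_site_positions("AACAACHAAAACGGCHKAA"))
```

Note that the pattern requires a residue after the histidine, so a site at the very end of a sequence is not reported.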

[ 1] Mathews F.S. Prog. Biophys. Mol. Biol. 45:1-56(1985). 

917. ATP-synt_A-c. ATP synthase Alpha chain, C terminal 

[1] Medline: 94344236. Structure at 2.8 A resolution of F1-ATPase from bovine heart
mitochondria. Abrahams JP, Leslie AG, Lutter R, Walker JE; Nature 1994;370:621-628.
Number of members: 125 

918. (Basic) 

Myc-type, 'helix-loop-helix' dimerization domain signature
HELIX_LOOP_HELIX 
A number of eukaryotic proteins, which probably are sequence specific DNA-binding
proteins that act as transcription factors, share a conserved domain of 40 to 50 amino acid
residues. It has been proposed [1] that this domain is formed of two amphipathic helices
joined by a variable length linker region that could form a loop. This 'helix-loop-helix' (HLH)
domain mediates protein dimerization and has been found in the proteins listed below
[2,3,E1,E2]. Most of these proteins have an extra basic region of about 15 amino acid
residues that is adjacent to the HLH domain and specifically binds to DNA. They are referred
to as basic helix-loop-helix proteins (bHLH), and are classified in two groups: class A
(ubiquitous) and class B (tissue-specific). Members of the bHLH family bind variations on
the core sequence 'CANNTG', also referred to as the E-box motif. The homo- or
heterodimerization mediated by the HLH domain is independent of, but necessary for DNA
binding, as two basic regions are required for DNA binding activity. The HLH proteins
lacking the basic domain (Emc, Id) function as negative regulators since they form
heterodimers, but fail to bind DNA. The hairy-related proteins (hairy, E(spl), deadpan) also
repress transcription although they can bind DNA. The proteins of this subfamily act together
with co-repressor proteins, like groucho, through their C-terminal motif WRPW.
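
The E-box core sequence 'CANNTG' is short enough to scan for directly in a DNA sequence. The following is a minimal sketch, assuming N stands for any of A, C, G or T; the function name find_e_boxes and the example sequence are hypothetical.

```python
import re

# CANNTG, with N restricted to the four standard bases.
# The lookahead with a capture group reports overlapping sites too.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_e_boxes(dna):
    """Return (0-based position, matched hexamer) for every E-box core in the sequence."""
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(dna.upper())]

# Synthetic DNA containing two E-box cores.
print(find_e_boxes("ttCACGTGaaCAGCTGcc"))
```

Every CANNTG hexamer has the same form on the complementary strand, because the reverse complement of CA..TG again starts with CA and ends with TG, so scanning one strand is sufficient to locate the sites.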

- The myc family of cellular oncogenes [4], which is currently known to contain four 
members: c-myc [E3], N-myc, L-myc, and B-myc. The myc genes are thought to play a role 
in cellular differentiation and proliferation. 

- Proteins involved in myogenesis (the induction of muscle cells). In mammals MyoD1
(Myf-3), myogenin (Myf-4), Myf-5, and Myf-6 (Mrf4 or herculin), in birds CMD1 (QMF-1),
in Xenopus MyoD and MF25, in Caenorhabditis elegans CeMyoD, and in Drosophila 
nautilus (nau). 

- Vertebrate proteins that bind specific DNA sequences ('E boxes') in various
immunoglobulin chain enhancers: E2A or ITF-1 (E12/pan-2 and E47/pan-1), ITF-2 (tcf4),
TFE3, and TFEB. 

- Vertebrate neurogenic differentiation factor 1 that acts as differentiation factor during 
neurogenesis. 

- Vertebrate MAX protein, a transcription regulator that forms a sequence-specific DNA-
binding protein complex with myc or mad.

- Vertebrate Max Interacting Protein 1 (MXI1 protein) which acts as a transcriptional
repressor and may antagonize myc transcriptional activity by competing for max. 
- Proteins of the bHLH/PAS superfamily which are transcriptional activators. In mammals,
AH receptor nuclear translocator (ARNT), single-minded homologs (SIM1 and SIM2),
hypoxia-inducible factor 1 alpha (HIF1A), AH receptor (AHR), neuronal pas domain proteins
(NPAS1 and NPAS2), endothelial pas domain protein 1 (EPAS1), mouse ARNT2, and
human BMAL1. In Drosophila, single-minded (SIM), AH receptor nuclear translocator
(ARNT), trachealess protein (TRH), and similar protein (SIMA).

- Mammalian transcription factors HES, which repress transcription by acting on two types 
of DNA sequences, the E box and the N box. 

- Mammalian MAD protein (max dimerizer) which acts as transcriptional repressor and may 
antagonize myc transcriptional activity by competing for max. 

- Mammalian Upstream Stimulatory Factor 1 and 2 (USF1 and USF2), which bind to a
symmetrical DNA sequence that is found in a variety of viral and cellular promoters. 

- Human lyl-1 protein, which is involved, by chromosomal translocation, in T-cell leukemia.

- Human transcription factor AP-4. 

- Mouse helix-loop-helix proteins MATH-1 and MATH-2 which activate E box-dependent
transcription in collaboration with E47. 

- Mammalian stem cell protein (SCL) (also known as tal1), a protein which may play an
important role in hemopoietic differentiation. SCL is involved, by chromosomal 
translocation, in stem-cell leukemia. 

- Mammalian proteins Id1 to Id4 [5]. Id (inhibitor of DNA binding) proteins lack a basic
DNA-binding domain but are able to form heterodimers with other HLH proteins, thereby 
inhibiting binding to DNA. 

- Drosophila extra-macrochaetae (emc) protein, which participates in sensory organ 
patterning by antagonizing the neurogenic activity of the achaete-scute complex. Emc is the
homolog of mammalian Id proteins. 

- Human Sterol Regulatory Element Binding Protein 1 (SREBP-1), a transcriptional activator 
that binds to the sterol regulatory element 1 (SRE-1) found in the flanking region of the 
LDLR gene and in other genes. 

- Drosophila achaete-scute (AS-C) complex proteins T3 (l'sc), T4 (scute), T5 (achaete) and
T8 (asense). The AS-C proteins are involved in the determination of the neuronal precursors 
in the peripheral nervous system and the central nervous system. 

- Mammalian homologs of achaete-scute proteins, the MASH-1 and MASH-2 proteins. 

- Drosophila atonal protein (ato) which is involved in neurogenesis. 
- Drosophila daughterless (da) protein, which is essential for neurogenesis and sex-
determination.

- Drosophila deadpan (dpn), a hairy-like protein involved in the functional differentiation of 
neurons. 

- Drosophila delilah (dei) protein, which plays an important role in the differentiation of
epidermal cells into muscle. 

- Drosophila hairy (h) protein, a transcriptional repressor which regulates the embryonic 
segmentation and adult bristle patterning. 

- Drosophila enhancer of split proteins E(spl), which are hairy-like proteins active during
neurogenesis and also act as transcriptional repressors.

- Drosophila twist (twi) protein, which is involved in the establishment of germ layers in 
embryos. 

- Maize anthocyanin regulatory proteins R-S and LC. 

- Yeast centromere-binding protein 1 (CPF1 or CBF1). This protein is involved in
chromosomal segregation. It binds to a highly conserved DNA sequence, found in centromeres
and in several promoters.

- Yeast INO2 and INO4 proteins.

- Yeast phosphate system positive regulatory protein PHO4 which interacts with the
upstream activating sequence of several acid phosphatase genes. 

- Yeast serine-rich protein TYE7 that is required for ty-mediated ADH2 expression. 

- Neurospora crassa nuc-1, a protein that activates the transcription of structural genes for 
phosphorus acquisition. 

- Fission yeast protein esc1 which is involved in the sexual differentiation process.

The schematic representation of the helix-loop-helix domain is shown here: 

xxxxxxxxxxxxxxxxxxxxxxxx                   xxxxxxxxxxxxxxxxxxxxxxx
  Amphipathic helix 1           Loop         Amphipathic helix 2

The signature pattern that was developed to detect this domain completely spans the
second amphipathic helix.



Consensus pattern: [DENSTAP]-[KR]-[LIVMAGSNT]-{FYWCPHKR}-[LIVMT]-[LIVM]-
x(2)-[STAV]-[LIVMSTACKR]-x-[VMFYH]-[LIVMTA]-{P}-{P}-[LIVMRKHQ]
Sequences known to belong to this class detected by the pattern: the majority, but far from all.
Other sequence(s) detected in SWISS-PROT: 135.

[ 1] Murre C., McCaw P.S., Baltimore D. Cell 56:777-783(1989).
[ 2] Garrel J., Campuzano S. BioEssays 13:493-498(1991). 
[ 3] Kato G.J., Dang C.V. FASEB J. 6:3065-3072(1992). 

[ 4] Krause M., Fire A., Harrison S.W., Priess J., Weintraub H. Cell 63:907-919(1990). 
[ 5] Riechmann V., van Cruechten I., Sablitzky F. Nucleic Acids Res. 22:749-755(1994). 

919. (Beta-lactamase) 

Beta-lactamases classes -A, -C, and -D active site 

Beta-lactamases (EC 3.5.2.6) [1,2] are enzymes which catalyze the hydrolysis of an amide 
bond in the beta-lactam ring of antibiotics belonging to the penicillin/cephalosporin 
family. Four kinds of beta-lactamase have been identified [3]. Class-B enzymes are zinc
containing proteins whilst class-A, -C and -D enzymes are serine hydrolases. The three
classes of serine beta-lactamases are evolutionarily related and belong to a superfamily [4]
that also includes DD-peptidases and a variety of other penicillin-binding proteins (PBP's).
All these proteins
contain a Ser-x-x-Lys motif, where the serine is the active site residue. Although clearly 
homologous, the sequences of the three classes of serine beta-lactamases exhibit a large 
degree of variability and only a small number of residues are conserved in addition to the 
catalytic serine. 

Since a pattern detecting all serine beta-lactamases would also pick up many unrelated 
sequences, it was decided to provide specific patterns, centered on the active site serine, for 
each of the three classes. 

Consensus pattern: [FY]-x-[LIVMFY]-x-S-[TV]-x-K-x(4)-[AGLM]-x(2)-[LC] [S is the active
site residue]
Sequences known to belong to this class detected by the pattern: ALL class-A beta-lactamases.
Other sequence(s) detected in SWISS-PROT: 7.

Consensus pattern: F-E-[LIVM]-G-S-[LIVMG]-[SA]-K [The first S is the active site residue]
Sequences known to belong to this class detected by the pattern: ALL class-C beta-lactamases.
Other sequence(s) detected in SWISS-PROT: NONE.

Consensus pattern: [PA]-x-S-[ST]-F-K-[LIV]-[PAL]-x-[STA]-[LI] [S is the active site
residue]
Sequences known to belong to this class detected by the pattern: ALL class-D beta-lactamases.
Other sequence(s) detected in SWISS-PROT: NONE.
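
The three class-specific patterns can be combined into a simple screen that reports which serine beta-lactamase class a sequence matches and where the putative active site serine lies. The following is a minimal sketch, assuming the serine offsets follow from the position of S in each pattern as annotated above; the names SIGNATURES and classify_serine_beta_lactamase and the synthetic fragments are hypothetical.

```python
import re

# Class -> (compiled pattern, offset of the active-site serine from the match start).
SIGNATURES = {
    # [FY]-x-[LIVMFY]-x-S-[TV]-x-K-x(4)-[AGLM]-x(2)-[LC]
    "A": (re.compile(r"[FY].[LIVMFY].S[TV].K.{4}[AGLM].{2}[LC]"), 4),
    # F-E-[LIVM]-G-S-[LIVMG]-[SA]-K (the first S is the active site residue)
    "C": (re.compile(r"FE[LIVM]GS[LIVMG][SA]K"), 4),
    # [PA]-x-S-[ST]-F-K-[LIV]-[PAL]-x-[STA]-[LI]
    "D": (re.compile(r"[PA].S[ST]FK[LIV][PAL].[STA][LI]"), 2),
}

def classify_serine_beta_lactamase(sequence):
    """Return {class: 0-based index of the putative active-site serine} for each class hit."""
    seq = sequence.upper()
    hits = {}
    for cls, (regex, ser_offset) in SIGNATURES.items():
        m = regex.search(seq)
        if m:
            hits[cls] = m.start() + ser_offset
    return hits

# Synthetic fragment built to satisfy the class-A pattern; it is not a real enzyme.
print(classify_serine_beta_lactamase("MM" + "FALASTAKAAAAAAAL"))
```

An empty result does not rule out a beta-lactamase: class-B enzymes are zinc-containing proteins and are not covered by these serine-centered patterns.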

[ 1] Ambler R.P. Philos. Trans. R. Soc. Lond., B, Biol. Sci. 289:321-331(1980). 

[ 2] Pastor N., Pinero D., Valdes A.M., Soberon X. Mol. Microbiol. 4:1957-1965(1990). 

[ 3] Bush K. Antimicrob. Agents Chemother. 33:259-263(1989). 

[ 4] Joris B., Ghuysen J.-M., Dive G., Renard A., Dideberg O., Charlier P., Frere J.M., Kelly 
J.A., Boyington J.C., Moews P.C., Knox J.R. Biochem. J. 250:313-324(1988). 

920. Biotin protein ligase (BPL) 

Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide 
from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is 
the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin 
enzymes. Each organism probably has only one BPL. Biotin attachment is a two-step
reaction that results in the formation of an amide linkage between the carboxyl group of 
biotin and the epsilon-amino group of the modified lysine [2]. 
Number of members: 26 

[1] Wilson KP, Shewchuk LM, Brennan RG, Otsuka AJ, Matthews BW; Medline: 93028443
"Escherichia coli biotin holoenzyme synthetase/bio repressor crystal structure delineates the
biotin- and DNA-binding domains." Proc Natl Acad Sci USA 1992;89:9257-9261.

[2] Chapman-Smith A, Cronan JE Jr; Medline: 10470036 "The enzymatic biotinylation of
proteins: a post-translational modification of exceptional specificity." Trends Biochem Sci
1999;24:359-363.



921. (BRCA2_repeat)

The alignment covers only the most conserved region of the repeat.

[1] Bork P, Blomberg N, Nilges M; Medline: 96241568 "Internal repeats in the BRCA2 
protein sequence." Nat Genet 1996;13:22-23. 

Number of members: 63 

922. (C6) 

This domain of unknown function is found in the C. elegans protein Swiss:Q19522. It is
presumed to be an extracellular domain. The C6 domain contains six conserved cysteine
residues in most copies of the domain. However, some copies of the domain are missing
cysteine residues 1 and 3, suggesting that these form a disulphide bridge.
Number of members: 23 

923. Cadherin cytoplasmic region (Cadherin_C_term) 

Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to 
the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins 
cluster to form foci of homophilic binding units. A key determinant of the strength of
cadherin-mediated binding is the juxtamembrane region of the cadherin. This
region induces clustering and also binds to the protein p120ctn [1].
Number of members: 59 

[1] Yap AS, Niessen CM, Gumbiner BM; Medline: 98234411 "The juxtamembrane region of
the cadherin cytoplasmic tail supports lateral clustering, adhesive strengthening, and
interaction with p120ctn." J Cell Biol 1998;141:779-789.

[2] Barth AI, Nathke IS, Nelson WJ; Medline: 97471931 "Cadherins, catenins and APC
protein: interplay between cytoskeletal complexes and signaling pathways." Curr Opin Cell
Biol 1997;9:683-690.

[3] Braga VM, Machesky LM, Hall A, Hotchin NA; Medline: 97327766 "The small GTPases
Rho and Rac are required for the establishment of cadherin-dependent cell-cell contacts." J
Cell Biol 1997;137:1421-1431.

924. Clathrin propeller repeat (Clathrin_propel) 

Clathrin is the scaffold protein of the basket-like coat that surrounds coated vesicles. The 
soluble assembly unit, a triskelion, contains three heavy chains and three light chains in an 
extended three-legged structure. Each leg contains one heavy and one light chain. The N- 
terminus of the heavy chain is known as the globular domain, and is composed of seven 
repeats which form a beta propeller [1]. 
Number of members: 61 

[1] ter Haar E, Musacchio A, Harrison SC, Kirchhausen T; Medline: 99043510 "Atomic 
structure of clathrin: a beta propeller terminal domain joins an alpha zigzag linker." Cell. 
1998;95:563-573. 

925. Respiratory-chain NADH dehydrogenase 30 Kd subunit signature (complex1_30Kd)

Respiratory-chain NADH dehydrogenase (EC 1.6.5.3) [1,2] (also known as complex I or 
NADH-ubiquinone oxidoreductase) is an oligomeric enzymatic complex located in the 
inner mitochondrial membrane which also seems to exist in the chloroplast and in 
cyanobacteria (as a NADH-plastoquinone oxidoreductase). Among the 25 to 30 polypeptide 
subunits of this bioenergetic enzyme complex there is one with a molecular weight of 30 
Kd (in mammals) which has been found to be: 

- Nuclear encoded, as a precursor form with a transit peptide in mammals, and in Neurospora 
crassa. 

- Mitochondrial encoded in Paramecium (protein P1), and in the slime mold Dictyostelium
discoideum (ORF 209).

- Chloroplast encoded in various higher plants (ORF 159).

It is also present in bacteria:

- In the cyanobacterium Synechocystis strain PCC 6803 (gene ndhJ).

- Subunit C of Escherichia coli NADH-ubiquinone oxidoreductase (gene nuoC).

- Subunit NQO5 of Paracoccus denitrificans NADH-ubiquinone oxidoreductase.

This protein, in its mature form, consists of 157 to 266 amino acid residues. The
best conserved region is located in the C-terminal section and can be used as a signature
pattern.

Consensus pattern: E-R-E-x(2)-[DE]-[LIVMFY](2)-x(6)-[HK]-x(3)-[KRP]-x-[LIVM]-[LIVMYS].
Sequences known to belong to this class detected by the pattern: ALL. Other
sequence(s) detected in SWISS-PROT: NONE.

[ 1] Ragan C.I. Curr. Top. Bioenerg. 15:1-36(1987). 

[ 2] Weiss H., Friedrich T., Hofhaus G., Preis D. Eur. J. Biochem. 197:563-576(1991).

926. Respiratory-chain NADH dehydrogenase 49 Kd subunit signature (complex1_49Kd)

Respiratory-chain NADH dehydrogenase (EC 1.6.5.3) [1,2] (also known as complex I or 
NADH-ubiquinone oxidoreductase) is an oligomeric enzymatic complex located in the 
inner mitochondrial membrane which also seems to exist in the chloroplast and in 
cyanobacteria (as a NADH-plastoquinone oxidoreductase). Among the 25 to 30 polypeptide 
subunits of this bioenergetic enzyme complex there is one with a molecular weight of 49 Kd 
(in mammals), which is the third largest subunit of complex I and is a component of the 
iron-sulfur (IP) fragment of the enzyme. It seems to bind a 4Fe-4S iron-sulfur cluster. The 49 
Kd subunit has been found to be: 

- Nuclear encoded, as a precursor form with a transit peptide in mammals, and in Neurospora 
crassa. 

- Mitochondrial encoded in protozoa such as Paramecium (ORF 400), Leishmania and
Trypanosoma (MURF 3).

- Chloroplast encoded in various higher plants (ORF 392).

The 49 Kd subunit is highly similar to [3,4]:

- Subunit D of Escherichia coli NADH-ubiquinone oxidoreductase (gene nuoD). 

- Subunit NQO4 of Paracoccus denitrificans NADH-ubiquinone oxidoreductase.

- Subunit 5 of Escherichia coli formate hydrogenlyase (gene hycE). 

- Subunit G of Escherichia coli hydrogenase-4 (gene hyfG). 

A highly conserved region, located in the N-terminal section of this subunit, was selected
as a signature pattern.

Consensus pattern: [LIVMH]-H-[RT]-[GA]-x-E-K-[LIVMTN]-x-E-x-[KRQ]. Sequences
known to belong to this class detected by the pattern: ALL.

[ 1] Ragan C.I. Curr. Top. Bioenerg. 15:1-36(1987). 

[ 2] Weiss H., Friedrich T., Hofhaus G., Preis D. Eur. J. Biochem. 197:563-576(1991). 

[ 3] Fearnley I.M., Walker J.E. Biochim. Biophys. Acta 1140:105-134(1992). 

[ 4] Weidner U., Geier S., Ptock A., Friedrich T., Leif H., Weiss H. J. Mol. Biol.
233:109-122(1993).

927. (COX2) 

Cytochrome c oxidase (EC 1.9.3.1) [1,2] is an oligomeric enzymatic complex which is a 
component of the respiratory chain and is involved in the transfer of electrons from 
cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial 
inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The enzyme 
complex consists of 3-4 subunits in prokaryotes and up to 13 polypeptides in mammals.

Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It 
contains two adjacent transmembrane regions in its N-terminus and the major part of the 
protein is exposed to the periplasmic or to the mitochondrial intermembrane space, 
respectively. CO II provides the substrate-binding site and contains a copper center called
Cu(A), probably the primary acceptor in cytochrome c oxidase. An exception is the 
corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-center. 
Several bacterial CO II have a C-terminal extension that contains a covalently bound heme c. 

It has been shown [3,4] that nitrous oxide reductase (EC 1.7.99.6) (gene nosZ) of 
Pseudomonas has sequence similarity in its C-terminus to CO II. This enzyme is part of the 
bacterial respiratory system which is activated under anaerobic conditions in the presence of 
nitrate or nitrous oxide. NosZ is a periplasmic homodimer that contains a dinuclear copper 
center, probably located in a 3-dimensional fold similar to the cupredoxin-like fold that has
been suggested for the copper-binding site of CO II [3].

The dinuclear purple copper center is formed by 2 histidines and 2 cysteines [5]. This region
was used as a signature pattern. The conserved valine and the conserved methionine are said
to be involved in stabilizing the copper-binding fold by interacting with each other.

Consensus pattern: V-x-H-x(33,40)-C-x(3)-C-x(3)-H-x(2)-M [The two C's and two H's are
copper ligands]. Sequences known to belong to this class detected by the pattern: ALL, except
for Paramecium primaurelia and some plants, where the pattern ends with Thr; an
RNA editing event at this position could change this Thr to Met.

Note: cytochrome cbb(3) subunit 2 does not belong to this family. 
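
The exception noted for the CO II pattern (sequences in which the final Met is replaced by Thr) can be illustrated with a small sketch that tries the strict signature first and then a relaxed form accepting Thr in the last position. The regular expressions are hand translations of the consensus pattern above; the function name, the return labels and the toy sequences are hypothetical, and the relaxed form is an illustrative assumption, not a pattern given in the source.

```python
import re

# Strict signature: V-x-H-x(33,40)-C-x(3)-C-x(3)-H-x(2)-M
STRICT = re.compile(r"V.H.{33,40}C.{3}C.{3}H.{2}M")
# Relaxed form (assumption): also accept Thr in the final position.
RELAXED = re.compile(r"V.H.{33,40}C.{3}C.{3}H.{2}[MT]")

def classify_cu_a_site(sequence: str) -> str:
    """Report whether a sequence carries the signature or its Thr variant."""
    if STRICT.search(sequence):
        return "signature"
    if RELAXED.search(sequence):
        return "Thr variant"
    return "no match"

# Made-up sequences: alanine filler around the conserved positions.
toy_thr = "V" + "A" + "H" + "A" * 35 + "C" + "AAA" + "C" + "AAA" + "H" + "AA" + "T"
toy_met = toy_thr[:-1] + "M"  # same sequence after a Thr-to-Met change

print(classify_cu_a_site(toy_thr))
print(classify_cu_a_site(toy_met))
```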

[ 1] Capaldi R.A., Malatesta F., Darley-Usmar V.M. Biochim. Biophys. Acta 726:135- 
148(1983). 

[ 2] Garcia-Horsman J.A., Barquera B., Rumbley J., Ma J., Gennis R.B. J. Bacteriol. 
176:5587-5600(1994). 

[ 3] van der Oost J., Lappalainen P., Musacchio A., Warne A., Lemieux L., Rumbley J., 
Gennis R.B., Aasa R., Pascher T., Malmstrom B.G., Saraste M. EMBO J. 11:3209- 
3217(1992). 

[ 4] Zumft W.G., Dreutsch A., Loechelt S., Cuypers H., Friedrich B., Schneider B. Eur. J.
Biochem. 208:31-40(1992). 

928. Cytochrome C assembly protein (CytC_asm) 

This family consists of various proteins involved in cytochrome c assembly from
mitochondria and bacteria: CycK from Rhizobium [3], CcmC from E. coli and Paracoccus
denitrificans [2,1], and orf240 from wheat mitochondria [4]. The members of this family are
probably integral membrane proteins with six predicted transmembrane helices. It has been
proposed that members of this family comprise a membrane component of an ABC (ATP
binding cassette) transporter complex. It is also proposed that this transporter is necessary for
transport of some component needed for cytochrome c assembly. One member, CycK,
contains a putative heme-binding motif [3]; orf240 also contains a putative heme-binding
motif and is a proposed ABC transporter with c-type heme as its proposed substrate [4].
However, it seems unlikely that all members of this family transport heme or c-type
apocytochromes, because CcmC in the putative CcmABC transporter transports neither [1].
Number of members: 67

[1] Page D, Pearce DA, Norris HA, Ferguson SJ; Medline: 97195802 "The Paracoccus
denitrificans ccmA, B and C genes: cloning and sequencing, and analysis of the potential of
their products to form a haem or apo-c-type cytochrome transporter." Microbiology
1997;143:563-576.

[2] Thoeny-Meyer L, Fischer F, Kunzler P, Ritz D, Hennecke H; Medline: 95362656
"Escherichia coli genes required for cytochrome c maturation." J Bacteriol
1995;177:4321-4326.

[3] Delgado MJ, Yeoman KH, Wu G, Vargas C, Davies A, Poole RK, Johnston AWB,
Downie JA; Medline: 95394794 "Characterization of the cycHJKL genes involved in
cytochrome c biogenesis and symbiotic nitrogen fixation in Rhizobium leguminosarum." J
Bacteriol 1995;177:4927-4934.

[4] Bonnard G, Grienenberger JM; Medline: 95124303 "A gene proposed to encode a
transmembrane domain of an ABC transporter is expressed in wheat mitochondria." Mol
Gen Genet 1995;246:91-99.

929. Cytochrome b559 subunits heme-binding site signature (cytochr_b559) 

Cytochrome b559 [1] is an essential component of photosystem II complex from oxygenic 
photosynthetic organisms. It is an integral thylakoid membrane protein composed of two 
subunits, alpha (gene psbE) and beta (gene psbF), each of which contains a histidine residue 
located in a transmembrane region. The two histidines coordinate the heme iron of 
cytochrome b559. 

The region around the heme-binding residue of both subunits is very similar and can be used 
as a signature pattern. 

Consensus pattern: [LIV]-x-[ST]-[LIVF]-R-[FYW]-x(2)-[IV]-H-[STGA]-[LIV]-[STGA]-[IV]-P
[H is the heme iron ligand]. Sequences known to belong to this class detected by the
pattern: ALL. Other sequence(s) detected in SWISS-PROT: NONE.

[ 1] Pakrasi H.B., de Ciechi P., Whitmarsh J. EMBO J. 10:1619-1627(1991).

930. Cytochrome b/b6 signatures (Cytochrome_b)

In the mitochondrion of eukaryotes and in aerobic prokaryotes, cytochrome b is a component
of respiratory chain complex III (EC 1.10.2.2), also known as the bc1 complex or
ubiquinol-cytochrome c reductase. In plant chloroplasts and cyanobacteria, there is an
analogous protein, cytochrome b6, a component of the plastoquinone-plastocyanin reductase
(EC 1.10.99.1), also known as the b6f complex.

Cytochrome b/b6 [1,2] is an integral membrane protein of approximately 400 amino acid 
residues that probably has 8 transmembrane segments. In plants and cyanobacteria, 
cytochrome b6 consists of two subunits encoded by the petB and petD genes. The sequence 
of petB is colinear with the N-terminal part of mitochondrial cytochrome b, while petD 
corresponds to the C-terminal part. Cytochrome b/b6 non-covalently binds two heme groups, 
known as b562 and b566. Four conserved histidine residues are postulated to be the ligands 
of the iron atoms of these two heme groups. 

Apart from regions around some of the histidine heme ligands, there are a few conserved 
regions in the sequence of b/b6. The best conserved of these regions includes an invariant P- 
E-W triplet which lies in the loop that separates the fifth and sixth transmembrane segments. 
It seems to be important for electron transfer at the ubiquinone redox site - called Qz or Qo 
(where o stands for outside) - located on the outer side of the membrane. 

A schematic representation of the structure of cytochrome b/b6 is shown below.

           +---Fe-b562----+
           | +--Fe-b566---|-+
           | |            | |
xxxxxxxxxxxHxHxxxxxxxxxxxxHxHxxxxxxxxxxPEWxxxxxxxxxxxxxxxxxx
<-----------------------Cytochrome-b----------------------->
<---------Cytochrome-b6-petB---------><-Cytochrome-b6-petD->

Two signature patterns were developed for cytochrome b/b6. The first includes the first
conserved histidine of b/b6, which is a heme b562 ligand; the second includes the conserved
PEW triplet.

Consensus pattern: [DENQ]-x(3)-G-[FYWMQ]-x-[LIVMF]-R-x(2)-H [H is a heme b562
ligand]. Sequences known to belong to this class detected by the pattern: ALL, except for 5
sequences.

Consensus pattern: P-[DE]-W-[FY]-[LFY](2). Sequences known to belong to this class
detected by the pattern: ALL, except for Odocoileus hemionus (mule deer) and Paramecium
tetraurelia cytochrome b.

[ 1] Howell N. J. Mol. Evol. 29:157-169(1989). 

[ 2] Esposti M.D., de Vries S., Crimi M., Ghelli A., Patarnello T., Meyer A. Biochim. 
Biophys. Acta 1143:243-271(1993). 

931. Phorbol esters / diacylglycerol binding domain (DAG_PE-bind) 

Diacylglycerol (DAG) is an important second messenger. Phorbol esters (PE) are analogues 
of DAG and potent tumor promoters that cause a variety of physiological changes when 
administered to both cells and tissues. DAG activates a family of serine/threonine protein 
kinases, collectively known as protein kinase C (PKC) [1]. Phorbol esters can directly 
stimulate PKC. The N-terminal region of PKC, known as C1, has been shown [2] to bind PE
and DAG in a phospholipid and zinc-dependent fashion. The C1 region contains one or two
copies (depending on the isozyme of PKC) of a cysteine-rich domain about 50 amino-acid 
residues long and essential for DAG/PE-binding. Such a domain has also been found in the 
following proteins: 

- Diacylglycerol kinase (EC 2.7.1.107) (DGK) [3], the enzyme that converts DAG into 
phosphatidate. It contains two copies of the DAG/PE-binding domain in its N-terminal 
section. At least five different forms of DGK are known in mammals. 

- N-chimaerin. A brain-specific protein which shows sequence similarities with the BCR
protein at its C-terminal part and contains a single copy of the DAG/PE-binding domain at its
N-terminal part. It has been shown [4,5] to be able to bind phorbol esters.

- The raf/mil family of serine/threonine protein kinases. These protein kinases contain a
single N-terminal copy of the DAG/PE-binding domain.

- The unc-13 protein from Caenorhabditis elegans. Its function is not known but it contains a 
copy of the DAG/PE-binding domain in its central section and has been shown to bind 
specifically to a phorbol ester in the presence of calcium [6]. 

- The vav oncogene. Vav was generated by a genetic rearrangement during gene transfer
assays. Its expression seems to be restricted to cells of hematopoietic origin. Vav seems [5,7]
to contain a DAG/PE-binding domain in the central part of the protein.

- The Drosophila GTPase activating protein rotund. 

The DAG/PE-binding domain binds two zinc ions; the ligands of these metal ions are
probably the six cysteines and two histidines that are conserved in this domain. A signature
pattern was developed that spans the entire DAG/PE-binding domain.

Consensus pattern: H-x-[LIVMFYW]-x(8,11)-C-x(2)-C-x(3)-[LIVMFC]-x(5,10)-C-x(2)-C-
x(4)-[HD]-x(2)-C-x(5,9)-C [All the C and H are involved in binding zinc]. Sequences known
to belong to this class detected by the pattern: ALL, except a few DGKs.

[ 1] Azzi A., Boscoboinik D., Hensey C. Eur. J. Biochem. 208:547-557(1992). 

[ 2] Ono Y., Fujii T., Igarashi K., Kuno T., Tanaka C., Kikkawa U., Nishizuka Y. Proc. Natl.
Acad. Sci. U.S.A. 86:4868-4871(1989).

[ 3] Sakane F., Yamada K., Kanoh H., Yokoyama C., Tanabe T. Nature 344:345-348(1990).

[ 4] Ahmed S., Kozma R., Monfries C., Hall C., Lim H.H., Smith P., Lim L. Biochem. J.
272:767-773(1990).

[ 5] Ahmed S., Kozma R., Lee J., Monfries C., Harden N., Lim L. Biochem. J. 280:233- 
241(1991). 

[ 6] Ahmed S., Maruyama I.N., Kozma R., Lee J., Brenner S., Lim L. Biochem. J. 287:995- 
999(1992). 

[ 7] Boguski M.S., Bairoch A., Attwood T.K., Michaels G.S. Nature 358:113-113(1992).

932. 3-dehydroquinate synthase (DHQ_synthase)

[1] Barten R, Meyer TF; Medline: 98273626 "Cloning and characterisation of the Neisseria
gonorrhoeae aroB gene." Mol Gen Genet 1998;258:34-44.

[2] Hawkins AR, Lamb HK; Medline: 96048023 "The molecular biology of multidomain
proteins. Selected examples." Eur J Biochem 1995;232:7-18.

The 3-dehydroquinate synthase EC:4.6.1.3 domain is present in isolation in various bacterial 
3-dehydroquinate synthases and also present as a domain in the pentafunctional AROM 
polypeptide Swiss:P07547 [2]. 3-dehydroquinate (DHQ) synthase catalyses the formation of
dehydroquinate (DHQ) and orthophosphate from 3-deoxy-D-arabino-heptulosonate 7-phosphate
[1]. This reaction is part of the shikimate pathway which is involved in the
biosynthesis of aromatic amino acids. 
Number of members: 25 

933. Dihydrofolate reductase signature (DiHfolate_red) 

Dihydrofolate reductases (EC 1.5.1.3) [1] are ubiquitous enzymes which catalyze the 
reduction of folic acid into tetrahydrofolic acid. They can be inhibited by a number of
antagonists such as trimethoprim and methotrexate which are used as antibacterial or
anticancer agents. A signature pattern was derived from a region in the N-terminal part of
these enzymes, which includes a conserved Pro-Trp dipeptide; the tryptophan has been 
shown [2] to be involved in the binding of substrate by the enzyme. 

Consensus pattern: [LVAGC]-[LIF]-G-x(4)-[LIVMF]-P-W-x(4,5)-[DE]-x(3)-[FYIV]-x(3)-[STIQ].
Sequences known to belong to this class detected by the pattern: ALL, except for
type II bacterial, plasmid-encoded, dihydrofolate reductases which do not belong to the same
class of enzymes.

[ 1] Harpers' Review of Biochemistry, Lange, Los Altos (1985). 

[ 2] Bolin J.T., Filman D.J., Matthews D.A., Hamlin R.C., Kraut J. J. Biol. Chem. 257:13650-
13662(1982).

[1] Ponting CP; Medline: 95397417 "AF-6/cno: neither a kinesin nor a myosin, but a bit of
both." Trends Biochem Sci 1995;20:265-266.



Number of members: 31 


935. (DNA_gyraseB_C) 

DNA topoisomerase II signature (cross-reference = TOPOISOMERASE_II)

DNA topoisomerase II (EC 5.99.1.3) [1,2,3,4,E1] is one of the two types of enzyme that
catalyze the interconversion of topological DNA isomers. Type II topoisomerases are ATP-
dependent and act by passing a DNA segment through a transient double-strand break.
Topoisomerase II is found in phages, archaebacteria, prokaryotes, eukaryotes, and in African
Swine Fever virus (ASF). In bacteriophage T4 topoisomerase II consists of three subunits
(the product of genes 39, 52 and 60). In prokaryotes and in archaebacteria the enzyme, known
as DNA gyrase, consists of two subunits (genes gyrA and gyrB [E2]). In some bacteria, a
second type II topoisomerase has been identified; it is known as topoisomerase IV and is
required for chromosome segregation; it also consists of two subunits (genes parC and parE).
In eukaryotes, type II topoisomerase is a homodimer.

There are many regions of sequence homology between the different subtypes of
topoisomerase II. The relation between the different subunits is shown in the following
representation:

<-------------------About-1400-residues-------------------->
[----------Protein 39-*----][----------Protein 52----------] Phage T4
[----------gyrB-------*----][----------gyrA----------------] Prokaryote II, Archaebacteria
[----------parE-------*----][----------parC----------------] Prokaryote IV
[---------------------*------------------------------------] Eukaryote and ASF

'*': Position of the pattern.

As a signature pattern for this family of proteins, a region was selected that contains a highly
conserved pentapeptide. The pattern is located in gyrB, in parE, and in protein 39 of phage
T4 topoisomerase.

Consensus pattern [LIVMA]-x-E-G-[DN]-S-A-x-[STAG] Sequences known to belong to this 
class detected by the pattern ALL. 

[ 1] Sternglanz R. Curr. Opin. Cell Biol. 1:533-535(1990). 

[ 2] Bjornsti M.-A. Curr. Opin. Struct. Biol. 1:99-103(1991). 

[ 3] Sharma A., Mondragon A. Curr. Opin. Struct. Biol. 5:39-47(1995). 

[ 4] Roca J. Trends Biochem. Sci. 20:156-160(1995). 

936. (DNA_topoisoIV)

DNA topoisomerase II signature (cross-reference = TOPOISOMERASE_II)

DNA topoisomerase II (EC 5.99.1.3) [1,2,3,4,E1] is one of the two types of enzyme that
catalyze the interconversion of topological DNA isomers. Type II topoisomerases are ATP- 
dependent and act by passing a DNA segment through a transient double-strand break. 
Topoisomerase II is found in phages, archaebacteria, prokaryotes, eukaryotes, and in African 
Swine Fever virus (ASF). In bacteriophage T4 topoisomerase II consists of three subunits 
(the product of genes 39, 52 and 60). In prokaryotes and in archaebacteria the enzyme, known 
as DNA gyrase, consists of two subunits (genes gyrA and gyrB [E2]). In some bacteria, a 
second type II topoisomerase has been identified; it is known as topoisomerase IV and is 
required for chromosome segregation; it also consists of two subunits (genes parC and parE).
In eukaryotes, type II topoisomerase is a homodimer. 

There are many regions of sequence homology between the different subtypes of
topoisomerase II. The relation between the different subunits is shown in the following
representation:

<-------------------About-1400-residues-------------------->
[----------Protein 39-*----][----------Protein 52----------] Phage T4
[----------gyrB-------*----][----------gyrA----------------] Prokaryote II, Archaebacteria
[----------parE-------*----][----------parC----------------] Prokaryote IV
[---------------------*------------------------------------] Eukaryote and ASF

'*': Position of the pattern.

As a signature pattern for this family of proteins, a region was selected that contains a highly 
conserved pentapeptide. The pattern is located in gyrB, in parE, and in protein 39 of phage 
T4 topoisomerase. 

Consensus pattern [LIVMA]-x-E-G-[DN]-S-A-x-[STAG] Sequences known to belong to this 
class detected by the patternALL. 

[ 1] Sternglanz R. Curr. Opin. Cell Biol. 1:533-535(1990). 

[ 2] Bjornsti M.-A. Curr. Opin. Struct. Biol. 1:99-103(1991). 

[ 3] Sharma A., Mondragon A. Curr. Opin. Struct. Biol. 5:39-47(1995). 

[ 4] Roca J. Trends Biochem. Sci. 20:156-160(1995). 

937. Prolyl oligopeptidase family serine active site (DPPIV_N_term) 

The prolyl oligopeptidase family [1,2,3] consists of a number of evolutionarily related
peptidases whose catalytic activity seems to be provided by a charge relay system similar to
that of the trypsin family of serine proteases, but which arose independently by convergent
evolution. The known members of this family are listed below.

- Prolyl endopeptidase (EC 3.4.21.26) (PE) (also called post-proline cleaving enzyme). PE is
an enzyme that cleaves peptide bonds on the C-terminal side of prolyl residues. The sequence 
of PE has been obtained from a mammalian species (pig) and from bacteria (Flavobacterium 
meningosepticum and Aeromonas hydrophila): there is a high degree of sequence 
conservation between these sequences. 

- Escherichia coli protease II (EC 3.4.21.83) (oligopeptidase B) (gene prtB) which cleaves 
peptide bonds on the C-terminal side of lysyl and argininyl residues. 

- Dipeptidyl peptidase IV (EC 3.4.14.5) (DPP IV). DPP IV is an enzyme that removes N-
terminal dipeptides sequentially from polypeptides having unsubstituted N-termini provided
that the penultimate residue is proline.

- Yeast vacuolar dipeptidyl aminopeptidase A (DPAP A) (gene: STE13) which is responsible
for the proteolytic maturation of the alpha-factor precursor.

- Yeast vacuolar dipeptidyl aminopeptidase B (DPAP B) (gene: DAP2). 

- Acylamino-acid-releasing enzyme (EC 3.4.19.1) (acyl-peptide hydrolase). This enzyme
catalyzes the hydrolysis of the amino-terminal peptide bond of an N-acetylated protein to
generate an N-acetylated amino acid and a protein with a free amino-terminus.

A conserved serine residue has experimentally been shown (in E. coli protease II as well as in
pig and bacterial PE) to be necessary for the catalytic mechanism. This serine, which is part
of the catalytic triad (Ser, His, Asp), is generally located about 150 residues away from the C-
terminal extremity of these enzymes (which are all proteins that contain about 700 to 800
amino acids).

Consensus pattern D-x(3)-A-x(3)-[LIVMFYW]-x(14)-G-x-S-x-G-G-[LIVMFYW](2) [S is 
the active site residue] Sequences known to belong to this class detected by the pattern ALL, 
except for yeast DPAP A. 
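
The statement that the active site serine lies about 150 residues from the C-terminal extremity can be checked for any candidate sequence by locating the signature and counting the residues that follow the serine. The sketch below is a hand translation of the consensus pattern above into a regular expression with a capture group around the serine; the function name and the synthetic test sequence are hypothetical and serve only as an illustration.

```python
import re

# D-x(3)-A-x(3)-[LIVMFYW]-x(14)-G-x-S-x-G-G-[LIVMFYW](2), serine captured.
ACTIVE_SITE = re.compile(r"D.{3}A.{3}[LIVMFYW].{14}G.(S).GG[LIVMFYW]{2}")

def active_site_serine(sequence: str):
    """Return (1-based serine position, residues after the serine) or None."""
    hit = ACTIVE_SITE.search(sequence)
    if hit is None:
        return None
    index = hit.start(1)                    # 0-based index of the serine
    return index + 1, len(sequence) - 1 - index

# Synthetic sequence: glutamine filler around one copy of the signature.
motif = "D" + "QQQ" + "A" + "QQQ" + "L" + "Q" * 14 + "G" + "Q" + "S" + "Q" + "GG" + "LL"
toy = "Q" * 500 + motif + "Q" * 140

print(active_site_serine(toy))
```

For a real protein of 700 to 800 residues, a second value far from 150 would suggest that the match deserves a closer look.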

Note: these proteins belong to families S9A/S9B/S9C in the classification of peptidases 
[4,E1]. 

[ 1] Rawlings N.D., Polgar L., Barrett A.J. Biochem. J. 279:907-911(1991). 
[ 2] Barrett A.J., Rawlings N.D. Biol. Chem. Hoppe-Seyler 373:353-360(1992). 
[ 3] Polgar L., Szabo E. Biol. Chem. Hoppe-Seyler 373:361-366(1992). 
[ 4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 244:19-61(1994). 

938. Deoxyhypusine synthase (DS) 

Eukaryotic initiation factor 5A (eIF-5A) contains an unusual amino acid,
hypusine [N epsilon-(4-aminobutyl-2-hydroxy)lysine]. The first step in the
post-translational formation of hypusine is catalysed by the enzyme
deoxyhypusine synthase (DS) EC:1.1.1.249. The modified version of eIF-5A,
and DS, are required for eukaryotic cell proliferation [1].
Number of members: 9

[1] Liao DI, Wolff EC, Park MH, Davies DR; Medline: 98154315 "Crystal structure of the
NAD complex of human deoxyhypusine synthase: an enzyme with a ball-and-chain
mechanism for blocking the active site." Structure 1998;6:23-32.



939. (DUF21) 

Many of the sequences in this family are annotated as hemolysins; however, this is due to a
similarity to Swiss:Q54318, which does not contain this domain. This domain is found in the
N-terminus of the proteins, adjacent to two intracellular CBS domains.
Number of members: 42 

940. (DUF59) 

This family includes prokaryotic proteins of unknown function. The family also includes
PhaH Swiss:O84984 from Pseudomonas putida. PhaH forms a complex with PhaF
Swiss:O84982, PhaG Swiss:O84983 and PhaI Swiss:O84985, which hydroxylates
phenylacetic acid to 2-hydroxyphenylacetic acid [1]. Members of this family may therefore
all be components of ring hydroxylating complexes.
Number of members: 15

[1] Olivera ER, Minambres B, Garcia B, Muniz C, Moreno MA, Ferrandez A, Diaz E, Garcia
JL, Luengo JM; Medline: 98263372 "Molecular characterization of the phenylacetic acid
catabolic pathway in Pseudomonas putida U: the phenylacetyl-CoA catabolon." Proc Natl 
Acad Sci U S A 1998;95:6419-6424. 

941. (DUF82) 

The protein contains four conserved cysteines that may be involved in metal binding or
disulphide bridges.
Number of members: 4

942. Riboflavin kinase / FAD synthetase (FAD_Synth)

This family consists of part of the bifunctional enzyme riboflavin kinase / FAD synthetase.
These enzymes have both ATP:riboflavin 5'-phosphotransferase and ATP:FMN-
adenylyltransferase activities [1]. They catalyse the 5'-phosphorylation of riboflavin to FMN
and the adenylylation of FMN to FAD [1].

CAUTION: It is not clear if this region of the enzymes catalyses either or both of the 
enzymatic reactions. 
Number of members: 27 

[1] Manstein DJ, Pai EF; Medline: 87057286 "Purification and characterization of FAD 
synthetase from Brevibacterium ammoniagenes." J Biol Chem 1986;261:16169-16173. 

943. [2Fe-2S] binding domain (fer2_2) 

[1] Romao MJ, Archer M, Moura I, Moura JJ, LeGall J, Engh R, Schneider M, Hof P, Huber
R; Medline: 96072968 "Crystal structure of the xanthine oxidase-related aldehyde
oxido-reductase from D. gigas." Science 1995;270:1170-1176.
Number of members: 53 

944. Filovirus glycoprotein (Filo_glycop) 

This family includes an extracellular region from the envelope glycoprotein of Ebola and 
Marburg viruses. This region is also produced as a separate transcript that gives rise to a non- 
structural, secreted glycoprotein, which is produced in large amounts and has an unknown 
function [1]. Processing of this protein may be involved in viral pathogenicity [2]. 
Number of members: 23 

[1] Volchkov VE, Feldmann H, Volchkova VA, Klenk HD; Medline: 98245155 "Processing
of the Ebola virus glycoprotein by the proprotein convertase furin." Proc Natl Acad Sci U S
A 1998;95:5762-5767.

[2] Sanchez A, Trappier SG, Mahy BW, Peters CJ, Nichol ST; Medline: 96195018 "The
virion glycoproteins of Ebola viruses are encoded in two reading frames and are expressed
through transcriptional editing." Proc Natl Acad Sci U S A 1996;93:3602-3607.

945. Frataxin-like domain (Frataxin_Cyay) 

This family contains proteins that have a domain related to the globular C-terminus of
frataxin, the protein that is mutated in Friedreich's ataxia. This domain is found in a family of
bacterial proteins. The function of this domain is currently unknown. 
Number of members: 12 

[1] Gibson TJ, Koonin EV, Musco G, Pastore A, Bork P; Medline: 97084946 "Friedreich's 
ataxia protein: phylogenetic evidence for mitochondrial dysfunction." Trends Neurosci 
1996;19:465-468. 

946. (GAF) 

Domain present in phytochromes and cGMP-specific phosphodiesterases. 
Number of members: 296 

[1] Aravind L, Ponting CP; Medline: 98094688 "The GAF domain: an evolutionary link 
between diverse phototransducing proteins." Trends Biochem Sci 1997;22:458-459. 

947. Galaptin signature (Gal-bind_lectin) 

All vertebrates synthesize soluble galactoside-binding lectins [1,2,3] (also known as
galectins, galaptins or S-lectins). These carbohydrate-binding proteins are developmentally
regulated. Although their exact physiological role is not yet clear, they seem to be involved in
differentiation, cellular regulation and tissue construction. The sequence of galactoside- 
binding lectins from electric eel (electrolectin), conger eel (congerin), chicken and a number 
of mammalian species is known. These lectins are proteins of about 130 to 140 amino acid 
residues (14 Kd to 16 Kd).

A number of other proteins are known to belong to this family:

- Galectin-3 (also known as MAC-2 antigen; CBP-35 or IgE-binding protein), a 35 Kd lectin
which binds immunoglobulin E and which is composed of two domains: an N-terminal domain
that consists of tandem repeats of a glycine/proline-rich sequence and a C-terminal galaptin
domain.

- Galectin-4 [4], which is composed of two galaptin domains. 

- Galectin-5. 

- Galectin-7 [5], a keratinocyte protein which could be involved in cell-cell and/or cell- 
matrix interactions necessary for normal growth control. 

- Galectin-8 [6], which is composed of two galaptin domains. 

- Galectin-9 [7], which is composed of two galaptin domains. 

- Human eosinophil lysophospholipase (EC 3.1.1.5) [8] (Charcot-Leyden crystal protein), a
protein that may have both enzymatic and lectin activities. It forms hexagonal
bipyramidal crystals in tissues and secretions from sites of eosinophil-associated 
inflammation. 

- Caenorhabditis elegans 32 Kd lactose-binding lectin [9]. This lectin is composed of two 
galaptin domains. 

- Caenorhabditis elegans lec-7 and lec-8. 

One of the conserved regions of these lectins contains a tryptophan that has been shown [10] 
to be essential to the binding of galactosides. This region was used as a signature pattern for 
these proteins. 

Consensus pattern: W-[GEK]-x-[EQ]-x-[KRE]-x(3,6)-[PCTF]-[LIVMF]-[NQEGSKV]-x-[GH]-x(3)-[DENKHS]-[LIVMFC] [W binds carbohydrate]
Sequences known to belong to this class detected by the pattern: ALL, except for pig galectin 4.
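
The consensus pattern notation used in these entries maps directly onto regular expressions, which may make a signature easier to apply to a candidate sequence. The following is a minimal sketch in Python and is not part of the original disclosure: the function name prosite_to_regex and the test sequence are illustrative assumptions, and only the syntax elements that appear in this section (single residues, bracketed alternatives, x, and repeat counts) are handled.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a consensus pattern such as 'R-F-G-D-P-E-x-[QM]' into a regex string."""
    parts = []
    for element in pattern.split("-"):
        # An element is a residue, 'x', or a [class], optionally followed by (n) or (n,m).
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = m.groups()
        # 'x' stands for any residue; a bracketed class is already valid regex syntax.
        regex = "." if residue == "x" else residue
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)

# Galaptin signature as given in the entry above.
GALAPTIN = ("W-[GEK]-x-[EQ]-x-[KRE]-x(3,6)-[PCTF]-[LIVMF]-"
            "[NQEGSKV]-x-[GH]-x(3)-[DENKHS]-[LIVMFC]")
GALAPTIN_RE = re.compile(prosite_to_regex(GALAPTIN))

# Synthetic sequence constructed to satisfy the pattern (hypothetical, not a real galectin).
toy_sequence = "MAA" + "WGAEAKAAAPLNAGAAADL" + "KK"
hit = GALAPTIN_RE.search(toy_sequence)
if hit:
    print("signature found at residue", hit.start() + 1)
```

A match only indicates that the sequence is compatible with the signature; as the entry notes for pig galectin 4, a true family member can fail to match.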

[ 1] Barondes S.H., Gitt M.A., Leffler H., Cooper D.N.W. Biochimie 70:1627-1632(1988).
[ 2] Hirabayashi J., Kasai K.-I. J. Biochem. 104:1-4(1988).
[ 3] Barondes S.H., Castronovo V., Cooper D.N.W., Cummings R.D., Drickamer K., Feizi
T., Gitt M.A., Hirabayashi J., Hughes C., Kasai K.-I., Leffler H., Liu F.-T., Lotan R.,
Mercurio A.M., Monsigny M., Pillair S., Poirer F., Raz A., Rigby P.W.J., Rini J.M., Wang
J.L. Cell 76:597-598(1994).
[ 4] Oda Y., Herrmann J., Gitt M., Turck C.W., Burlingame A.L., Barondes S.H., Leffler H.
J. Biol. Chem. 268:5929-5939(1993).
[ 5] Madsen P., Rasmussen H.H., Flint T., Gromov P., Kruse T.A., Honore B., Vorum H.,
Celis J.E. J. Biol. Chem. 270:5823-5829(1995).
[ 6] Hadari Y.R., Paz K., Dekel R., Mestrovic T., Accili D., Zick Y. J. Biol. Chem.
270:3447-3453(1995).
[ 7] Wada J., Kanwar Y.S. J. Biol. Chem. 272:6078-6086(1997).
[ 8] Ackerman S.J., Corrette S.E., Rosenberg H.F., Bennett J.C., Mastrianni D.M.,
Nicholson-Weller A., Weller P.F., Chin D.T., Tenen D.G. J. Immunol. 150:456-468(1993).
[ 9] Hirabayashi J., Satoh M., Kasai K.-I. J. Biol. Chem. 267:15485-15490(1992).
[10] Abbott W.M., Feizi T. J. Biol. Chem. 266:5552-5557(1991).

948. (GARS) Phosphoribosylglycinamide synthetase signature (phosphoribosylamine glycine
ligase)

PROSITE: PDOC00164; cross-reference(s): PS00184

Phosphoribosylglycinamide synthetase (GARS) [1] catalyzes the second step in the de novo
biosynthesis of purine, the ATP-dependent addition of 5-phosphoribosylamine to glycine to
form 5'-phosphoribosylglycinamide.

In bacteria GARS is a monofunctional enzyme (encoded by the purD gene); in yeast it is part,
with AIRS, of a bifunctional enzyme (encoded by the ADE5,7 gene); in higher eukaryotes it is
part, with AIRS and with phosphoribosylglycinamide formyltransferase (GART), of a
trifunctional enzyme (GARS-AIRS-GART).

The sequence of GARS is well conserved. A highly conserved octapeptide was
selected as a signature pattern.

Consensus pattern: R-F-G-D-P-E-x-[QM]

Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Aiba A., Mizobuchi K. J. Biol. Chem. 264:21239-21246(1989).

949. GLTT - GLTT repeat (12 copies)

This short repeat of unknown function is found in multiple copies in several C. elegans
proteins. The repeat is five residues long and consists of XGLTT where X can be any amino 
acid. Number of members: 34. 
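
Because the repeat is defined simply as any residue followed by G-L-T-T, its copy number in a sequence can be estimated with a one-line regular expression. The sketch below is an illustration only: the function name count_gltt_repeats and the test sequence are hypothetical, and the count is of non-overlapping units.

```python
import re

# Any residue followed by G-L-T-T, i.e. the five-residue XGLTT unit described above.
XGLTT = re.compile(r"[A-Z]GLTT")

def count_gltt_repeats(sequence: str) -> int:
    """Count non-overlapping XGLTT units in a one-letter-code protein sequence."""
    return len(XGLTT.findall(sequence))

# Synthetic sequence with three copies of the repeat (not a real C. elegans protein).
toy_sequence = "MK" + "AGLTT" + "SGLTT" + "PGLTT" + "QRS"
print(count_gltt_repeats(toy_sequence))
```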

950. Glu_synthase - Conserved region in glutamate synthase 

This family represents a region of the glutamate synthase protein. This region is expressed as 
a separate subunit in the glutamate synthase alpha subunit from archaebacteria, or as part of a
large multidomain enzyme in other organisms. The aligned region of these proteins contains a 
putative FMN binding site and Fe-S cluster. Number of members: 44. 

[1] Medline: 97082505. Sequence of the GLT1 gene from Saccharomyces cerevisiae reveals
the domain structure of yeast glutamate synthase. Filetici P, Martegani MP, Valenzuela L, 
Gonzalez A, Ballario P; Yeast 1996;12:1359-1366. 

951. (Glyco_hydro_2) Glycosyl hydrolases family 2 signatures 
GLYCOSYL_HYDROL_F2_1; PS00608; GLYCOSYL_HYDROL_F2_2

It has been shown [1,2,E1] that the following glycosyl hydrolases can be, on the basis of 
sequence similarities, classified into a single family: 

-Beta-galactosidases (EC 3.2.1.23) from bacteria such as Escherichia coli (genes lacZ and 
ebgA), Clostridium acetobutylicum, Clostridium thermosulfurogenes, Klebsiella 
pneumoniae, Lactobacillus delbrueckii, or Streptococcus thermophilus and from the fungus
Kluyveromyces lactis.

-Beta-glucuronidase (EC 3.2.1.31) from Escherichia coli (gene uidA) and from mammals.

One of the conserved regions in these enzymes is centered on a conserved glutamic acid
residue which has been shown [3], in Escherichia coli lacZ, to be the general acid/base 
catalyst in the active site of the enzyme. This region has been used as a signature pattern. A 
highly conserved region located some sixty residues upstream from the active site glutamate 
has been selected as a second signature pattern. 

Consensus pattern: N-x-[LIVMFYWD]-R-[STACN](2)-H-Y-P-x(4)-[LIVMFYWS](2)-x(3)-[DN]-x(2)-G-[LIVMFYW](4)
Sequences known to belong to this class detected by the pattern: ALL.

Consensus pattern: [DENQLF]-[KRVW]-N-[HRY]-[STAPV]-[SAC]-[LIVMFS](3)-W-[GS]-x(2,3)-N-E [E is the active site residue]
Sequences known to belong to this class detected by the pattern: ALL, except for Rhizobium meliloti lacZ.
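
The two family 2 signatures can be applied together, and the separation between them checked against the "some sixty residues" stated above. The following Python sketch is offered as an illustration under stated assumptions: the regular expressions are hand-translated from the two consensus patterns, the function name signature_spacing is hypothetical, and the test sequence is synthetic, built so that the glutamate lies exactly sixty residues after the start of the upstream signature.

```python
import re

# Hand-translated from the two consensus patterns quoted above (an assumption of this sketch).
UPSTREAM_SIG = re.compile(
    r"N.[LIVMFYWD]R[STACN]{2}HYP.{4}[LIVMFYWS]{2}.{3}[DN].{2}G[LIVMFYW]{4}"
)
ACTIVE_SITE_SIG = re.compile(
    r"[DENQLF][KRVW]N[HRY][STAPV][SAC][LIVMFS]{3}W[GS].{2,3}NE"
)

def signature_spacing(sequence):
    """Return (upstream_start, glutamate_index, distance), or None if a signature is missing."""
    upstream = UPSTREAM_SIG.search(sequence)
    if upstream is None:
        return None
    # Look for the active-site signature only downstream of the first signature.
    active = ACTIVE_SITE_SIG.search(sequence, upstream.end())
    if active is None:
        return None
    glutamate = active.end() - 1  # the active-site pattern ends with the catalytic E
    return upstream.start(), glutamate, glutamate - upstream.start()

# Synthetic test sequence (hypothetical, not a real beta-galactosidase).
toy_sequence = "NALRSSHYPAAAALLAAADAAGLLLL" + "A" * 20 + "DKNHSSLLLWGAANE"
print(signature_spacing(toy_sequence))
```

Indices are zero-based; add one to convert to conventional residue numbering.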

[ 1] Henrissat B. Biochem. J. 280:309-316(1991).
[ 2] Schroeder C.J., Robert C., Lenzen G., McKay L.L., Mercenier A. J. Gen. Microbiol.
137:369-380(1991).
[ 3] Gebler J.C., Aebersold R., Withers S.G. J. Biol. Chem. 267:11126-11130(1992).

952. (Glyco_hydro_3) Glycosyl hydrolases family 3 active site

PROSITE: PDOC00621. PROSITE cross-reference(s) PS00775; GLYCOSYL_HYDROL_F3

It has been shown [1,2] that the following glycosyl hydrolases can be, on the basis of
sequence similarities, classified into a single family: 

-Beta glucosidases (EC 3.2.1.21) from the fungi Aspergillus wentii (A-3), Hansenula
anomala, Kluyveromyces fragilis, Saccharomycopsis fibuligera (BGL1 and BGL2),
Schizophyllum commune and Trichoderma reesei (BGL1).

-Beta glucosidases from the bacteria Agrobacterium tumefaciens (Cbg1), Butyrivibrio
fibrisolvens (bglA), Clostridium thermocellum (bglB), Escherichia coli (bglX), Erwinia
chrysanthemi (bgxA) and Ruminococcus albus.

-Alteromonas strain O-7 beta-hexosaminidase A (EC 3.2.1.52).

-Bacillus subtilis hypothetical protein yzbA.

-Escherichia coli hypothetical protein ycfO and HI0959, the corresponding Haemophilus
influenzae protein.

One of the conserved regions in these enzymes is centered on a conserved aspartic 
acid residue which has been shown [3], in Aspergillus wentii beta-glucosidase A3, to be 
implicated in the catalytic mechanism. This region was used as a signature pattern. 

Consensus pattern: [LIVM](2)-[KR]-x-[EQK]-x(4)-G-[LIVMFT]-[LIVT]-[LIVMF]-[ST]-D-x(2)-[SGADNI] [D is the active site residue]

Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Henrissat B. Biochem. J. 280:309-316(1991).
[ 2] Castle L.A., Smith K.D., Morris R.O. J. Bacteriol. 174:1478-1486(1992).
[ 3] Bause E., Legler G. Biochim. Biophys. Acta 626:459-465(1980).

953. GP120 - Envelope glycoprotein GP120 

The entry of HIV requires interaction of viral GP120 with CD4 (Swiss:P01730) and a chemokine
receptor on the cell surface. Number of members: 17891 

[1] Medline: 98303379. Structure of an HIV gp120 envelope glycoprotein in complex with
the CD4 receptor and a neutralizing human antibody. Kwong PD, Wyatt R, Robinson J, 
Sweet RW, Sodroski J, Hendrickson WA; Nature 1998;393:648-659. 

954. (GSPII_E) Bacterial type II secretion system protein E signature 
PROSITE: PDOC00567. PROSITE cross-reference(s) PS00662; T2SP_E 

A number of bacterial proteins, some of which are involved in a general secretion 
pathway (GSP) for the export of proteins (also called the type II pathway) [1,2], have been 
found to be evolutionarily related. These proteins are listed below:

-The 'E' protein from the GSP operon of: Aeromonas (gene exeE); Erwinia (gene outE); 
Escherichia coli (gene yheG); Klebsiella pneumoniae (gene pulE); Pseudomonas aeruginosa 
(gene xcpR); Vibrio cholerae (gene epsE) and Xanthomonas campestris (gene xpsE). 
-Agrobacterium tumefaciens Ti plasmid virB operon protein 11. This protein is required for 
the transfer of T-DNA to plants. 

-Bacillus subtilis comG operon protein 1 which is required for the uptake of DNA by 
competent Bacillus subtilis cells. 

-Aeromonas hydrophila tapB, involved in type IV pilus assembly. 
-Pseudomonas protein pilB, which is essential for the formation of the pili. 
-Pseudomonas aeruginosa twitching motility protein pilT.
-Neisseria gonorrhoeae type IV pilus assembly protein pilF. 
-Vibrio cholerae protein tcpT, which is involved in the biosynthesis of the 
tcp pilus. 

-Escherichia coli protein hofB (hopB). 
-Escherichia coli hypothetical protein ygcB. 
-Escherichia coli hypothetical protein yggR. 

These proteins have from 344 (pilT and virB11) to 568 (tapB) amino acids; they are
probably cytoplasmically located and, on the basis of the presence of a conserved P-loop
region (see <PDOC00017>), probably bind ATP. A region that overlaps the 'B' motif of
ATP-binding proteins was selected as a signature pattern.

Consensus pattern: [LIVM]-R-x(2)-P-D-x-[LIVM](3)-G-E-[LIVM]-R-D

Sequences known to belong to this class detected by the pattern: ALL, except for ygcB.

[ 1] Salmond G.P.C., Reeves P.J. Trends Biochem. Sci. 18:7-12(1993).
[ 2] Hobbs M., Mattick J.S. Mol. Microbiol. 10:233-243(1993).

955. (guanylate_cyc) Guanylate cyclases signature 

PROSITE: PDOC00425. PROSITE cross-reference(s) PS00452; GUANYLATE_CYCLASES

Guanylate cyclases (EC 4.6.1.2) [1 to 4] catalyze the formation of cyclic GMP (cGMP) from
GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and
regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular
smooth muscle relaxation and retinal phototransduction is well established. Guanylate cyclase
is found in both the soluble and particulate fractions of eukaryotic cells. The soluble and
plasma membrane-bound forms differ in structure, regulation and other properties.

Most currently known plasma membrane-bound forms are receptors for small 
polypeptides. The topology of such proteins is the following: they have an N-terminal
extracellular domain which acts as the ligand binding region, then a transmembrane domain, 
followed by a large cytoplasmic C-terminal region that can be subdivided into two domains: a 
protein kinase-like domain that appears important for proper signalling and a cyclase catalytic 
domain. This topology is schematically represented below. 

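+---------------------xxxxx-----------------------+---------+
| Ligand-binding      XXXXX  Protein Kinase like  | Cyclase |
+---------------------xxxxx-----------------------+---------+
    Extracellular  Transmembrane        Cytoplasmic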

The known guanylate cyclase receptors are: 

-The sea urchin receptors for speract and resact, which are small peptides that stimulate
sperm motility and metabolism.

-The receptors for natriuretic peptides (ANF). Two forms of ANF receptors with guanylate
cyclase activity are currently known: GC-A (or ANP-A) which seems specific to atrial 
natriuretic peptide (ANP), and GC-B (or ANP-B) which seems to be stimulated more 
effectively by brain natriuretic peptide (BNP) than by ANP. 

-The receptor for Escherichia coli heat-stable enterotoxin (GC-C). The endogenous ligand 
for this intestinal receptor seems to be a small peptide called guanylin.

-Retinal guanylate cyclase (retGC), which probably plays a specific functional role in the
rods and/or cones of photoreceptors. It is not known if this protein acts as a receptor, but its
structure is similar to that of the other plasma membrane-bound GCs.

The soluble forms of guanylate cyclase are cytoplasmic heterodimers. The two
subunits, alpha and beta, are highly related proteins of 70 to 82 Kd. Two
forms of beta subunits are currently known: beta-1 which seems to be expressed in lung and 
brain, and beta-2 which is more abundant in kidney and liver. 

The membrane and cytoplasmic forms of guanylate cyclase share a conserved domain 
which is probably important for the catalytic activity of the enzyme. Such a domain is also 
found twice in the different forms of membrane-bound adenylate cyclases (also known as
class-III) [5,6] from mammals, slime mold or Drosophila. A consensus pattern was derived
from the most conserved region in that domain. 

Consensus pattern: G-V-[LIVM]-x(0,1)-G-x(5)-[FY]-x-[LIVM]-[FYW]-[GS]-[DNTHKW]-[DNT]-[IV]-[DNTA]-x(5)-[DE]

Sequences known to belong to this class detected by the pattern: ALL, except for the sea
urchin Arbacia punctulata resact receptor, which lacks this domain.

Note: this pattern will detect both domains of class-III adenylate cyclases.

[ 1] Koesling D., Boehme E., Schultz G. FASEB J. 5:2785-2791(1991).
[ 2] Garbers D.L. New Biol. 2:499-504(1990).
[ 3] Garbers D.L. Cell 71:1-4(1992).
[ 4] Yuen P.S.T., Garbers D.L. Annu. Rev. Neurosci. 15:193-225(1992).
[ 5] Iyengar R. FASEB J. 7:768-775(1993).
[ 6] Barzu O., Danchin A. Prog. Nucleic Acid Res. Mol. Biol. 49:241-283(1994).

956. Hemolysin-type calcium-binding region signature (HemolysinCabind)

Gram-negative bacteria produce a number of proteins which are secreted into the growth
medium by a mechanism that does not require a cleaved N-terminal signal sequence. These 
proteins, while having different functions, seem [1] to share two properties: they bind 
calcium and they contain a variable number of tandem repeats consisting of a nine amino acid 
motif rich in glycine, aspartic acid and asparagine. It has been shown [2] that such a domain 
is involved in the binding of calcium ions in a parallel beta roll structure. The proteins which 
are currently known to belong to this category are: 

- Hemolysins from various species of bacteria. Bacterial hemolysins are exotoxins that attack 
blood cell membranes and cause cell rupture. The hemolysins which are known to contain 
such a domain are those from: E. coli (gene hlyA), A. pleuropneumoniae (gene appA), A. 
actinomycetemcomitans and P. haemolytica (leukotoxin) (gene lktA).

- Cyclolysin from Bordetella pertussis (gene cyaA). A multifunctional protein which is both 
an adenylate cyclase and a hemolysin. 

- Extracellular zinc proteases: serralysin (EC 3.4.24.40) from Serratia, prtB and prtC from 
Erwinia chrysanthemi and aprA from Pseudomonas aeruginosa. 

- Nodulation protein nodO from Rhizobium leguminosarum.

A signature pattern was derived from conserved positions in the sequence of the calcium- 
binding domain. 

Consensus pattern: D-x-[LI]-x(4)-G-x-D-x-[LI]-x-G-G-x(3)-D
Sequences known to belong to this class detected by the pattern: ALL.

Note: This pattern is found once in nodO and the extracellular proteases but up to 5 times in 
some hemolysin/cyclolysins. 
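
Since the number of copies of this signature differs between protein classes, it can be useful to list every position at which the pattern occurs rather than only the first. The sketch below is one way to do this in Python; the regular expression is hand-translated from the consensus pattern above, the function name find_calcium_binding_sites is hypothetical, and the test sequences are synthetic. A lookahead is used so that occurrences that overlap one another, as may happen within tandem repeats, are all reported.

```python
import re

# Hand-translated from D-x-[LI]-x(4)-G-x-D-x-[LI]-x-G-G-x(3)-D (19 residues long).
# Wrapping the expression in a lookahead lets finditer report overlapping occurrences.
CALCIUM_BINDING = re.compile(r"(?=(D.[LI].{4}G.D.[LI].GG.{3}D))")

def find_calcium_binding_sites(sequence: str) -> list:
    """Return the zero-based start index of every occurrence of the signature."""
    return [m.start() for m in CALCIUM_BINDING.finditer(sequence)]

# Synthetic sequences (hypothetical): one copy, and two copies separated by a short linker.
one_copy = "DALAAAAGADALAGGAAAD"
two_copies = one_copy + "KKK" + one_copy

print(len(find_calcium_binding_sites(one_copy)))
print(len(find_calcium_binding_sites(two_copies)))
```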

[ 1] Economou A., Hamilton W.D.O., Johnston A.W.B., Downie J.A. EMBO J. 9:349-354(1990).
[ 2] Baumann U., Wu S., Flaherty K.M., McKay D.B. EMBO J. 12:3357-3364(1993).

957. Hint module (Hint)

This is an alignment of the Hint module in the Hedgehog proteins. It does not include any
inteins, which also possess the Hint module.
Number of members: 36 

[1] Hall TM, Porter JA, Young KE, Koonin EV, Beachy PA, Leahy DJ; Medline: 97474313 
"Crystal structure of a Hedgehog autoprocessing domain: homology between Hedgehog and 
self-splicing proteins." Cell 1997;91:85-97. 

958. Hydantoinase/oxoprolinase (Hydantoinase) 

This family includes the enzymes hydantoinase and oxoprolinase EC:3.5.2.9. Both reactions
involve the hydrolysis of 5-membered rings via hydrolysis of their internal imide bonds [1]. 
Number of members: 14 

[1] Ye GJ, Breslow EB, Meister A, Guo-jie Y [corrected to Ye GJ]; Medline: 97113037
"The amino acid sequence of rat kidney 5-oxo-L-prolinase determined by cDNA cloning" 
[published erratum appears in J Biol Chem 1997 Feb 14;272(7):4646] J Biol Chem 
1996;271:32293-32300. 

959. IMP dehydrogenase / GMP reductase signature (IMPDH_N) 

IMP dehydrogenase (EC 1.1.1.205) (IMPDH) catalyzes the rate-limiting reaction of de novo 
GTP biosynthesis, the NAD-dependent oxidation of IMP into XMP [1]. Inhibition of IMP
dehydrogenase activity results in the cessation of DNA synthesis. As IMP dehydrogenase is 
associated with cell proliferation, it is a possible target for cancer chemotherapy. Mammalian 
and bacterial IMPDHs are tetramers of identical chains. There are two IMP dehydrogenase 
isozymes in humans [2]. 

GMP reductase (EC 1.6.6.8) catalyzes the irreversible and NADPH-dependent reductive 
deamination of GMP into IMP [3]. It converts nucleobase, nucleoside and nucleotide 
derivatives of G to A nucleotides, and maintains intracellular balance of A and G nucleotides.

IMP dehydrogenase and GMP reductase share many regions of sequence similarity. One of
these regions is centered on a cysteine residue thought [3] to be involved in binding IMP.
This region was used as a signature pattern.

Consensus pattern: [LIVM]-[RK]-[LIVM]-G-[LIVM]-G-x-G-S-[LIVM]-C-x-T [C is the putative IMP-binding residue]
Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Collart F.R., Huberman E. J. Biol. Chem. 263:15769-15772(1988).
[ 2] Natsumeda Y., Ohno S., Kawasaki H., Konno Y., Weber G., Suzuki K. J. Biol. Chem.
265:5292-5295(1990).
[ 3] Andrews S.C., Guest J.R. Biochem. J. 255:35-43(1988).

960. impB/mucB/samB family (IMS)

These proteins are involved in UV protection (Swiss).
Number of members: 38

961. Type II intron maturase (Intron_maturas2)

Group II introns use intron-encoded reverse transcriptase, maturase and DNA endonuclease
activities for site-specific insertion into DNA [2]. Although introns of this type are
self-splicing in vitro, they require a maturase protein for splicing in vivo. It has been shown
that a specific region of the aI2 intron is needed for the maturase function [1]. This region
was found to be conserved in group II introns and called domain X [3].

Number of members: 335

[1] Moran JV, Mecklenburg KL, Sass P, Belcher SM, Mahnke D, Lewin A, Perlman P;
Medline: 94301788 "Splicing defective mutants of the COXI gene of yeast mitochondrial
DNA: initial definition of the maturase domain of the group II intron aI2." Nucleic Acids Res
1994;22:2057-2064.

[2] Guo H, Zimmerly S, Perlman PS, Lambowitz AM; Medline: 98031910 "Group II intron
endonucleases use both RNA and protein subunits for recognition of specific sequences in 
double-stranded DNA." EMBO J 1997;16:6835-6848. 

[3] Mohr G, Perlman PS, Lambowitz AM; Medline: 94077696 "Evolutionary relationships 
among group II intron-encoded proteins and identification of a conserved domain that may be 
related to maturase function." Nucleic Acids Res 1993;21:4991-4997. 

962. LAGLIDADG endonuclease (Intron_maturase) 

[1] Heath PJ, Stephens KM, Monnat RJ Jr, Stoddard BL; Medline: 97331323 "The structure 
of I-CreI, a group I intron-encoded homing endonuclease." Nat Struct Biol 1997;4:468-476.
[2] Belfort M, Roberts RJ; Medline: 97402526 "Homing endonucleases: keeping the house in 
order." Nucleic Acids Res 1997;25:3379-3388. 

[3] Dalgaard JZ, Klar AJ, Moser MJ, Holley WR, Chatterjee A, Mian IS; Medline: 98026854 
"Statistical modeling and analysis of the LAGLIDADG family of site-specific endonucleases 
and identification of an intein that encodes a site-specific endonuclease of the HNH family."
Nucleic Acids Res 1997;25:4626-4638. 

Number of members: 220 

963. Isopentenyl transferase (IPT) 

Isopentenyl transferase / dimethylallyl transferase synthesizes isopentenyladenosine 5'-
monophosphate, a cytokinin that induces shoot formation on host plants infected with the Ti 
plasmid [1]. 

Number of members: 16 

[1] Canaday J, Gerad JC, Crouzet P, Otten L; Medline: 93101133 "Organization and 
functional analysis of three T-DNAs from the vitopine Ti plasmid pTiS4." Mol Gen Genet 
1992;235:292-303. 

964. Laminin EGF-like (Domains III and V) (laminin_EGF)

This family is like EGF but has 8 conserved cysteines instead of 6.
Number of members: 501 

[1] Engel J; Medline: 93041759 "Laminins and other strange proteins." Biochemistry 
1992;31:10643-10651. 

965. Legume lectins signatures (lectin_legA)

Leguminous plants synthesize sugar-binding proteins which are called legume lectins [1,2]. 
These lectins are generally found in the seeds. The exact function of legume lectins is not 
known but they may be involved in the attachment of nitrogen-fixing bacteria to legumes and 
in the protection against pathogens. Legume lectins bind calcium and manganese (or other 
transition metals). 

Legume lectins are synthesized as precursor proteins of about 230 to 260 amino acid 
residues. Some legume lectins are proteolytically processed to produce two chains: beta 
(which corresponds to the N-terminal) and alpha (C-terminal). The lectin concanavalin A 
(con A) from jack bean is exceptional in that the two chains are transposed and ligated (by 
formation of a new peptide bond). The N-terminus of mature conA thus corresponds to that 
of the alpha chain and the C-terminus to the beta chain. 

Two signature patterns were developed specific to legume lectins: the first is located in the C- 
terminal section of the beta chain and contains a conserved aspartic acid residue important for 
the binding of calcium and manganese; the second one is located in the N-terminal of the 
alpha chain. 

Consensus pattern: [LIV]-[STAG]-V-[DEQV]-[FLI]-D-[ST] [D binds manganese and calcium]
Sequences known to belong to this class detected by the pattern: ALL.

Consensus pattern: [LIV]-x-[EDQ]-[FYWKR]-V-x-[LIVF]-G-[LF]-[ST]
Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Sharon N., Lis H. FASEB J. 4:3198-320(1990).
[ 2] Lis H., Sharon N. Annu. Rev. Biochem. 55:33-37(1986).

966. Malate synthase signature (malate_synthase) 

Malate synthase (EC 4.1.3.2) catalyzes the aldol condensation of glyoxylate with acetyl-CoA
to form malate - the second step of the glyoxylate bypass, an alternative to the tricarboxylic
acid cycle in bacteria, fungi and plants. Malate synthase is a protein of 530 to 570 amino
acids whose sequence is highly conserved across species [1]. As a signature pattern, a very
conserved region was selected in the central section of the enzyme.

Consensus pattern: [KR]-[DENQ]-H-x(2)-G-L-N-x-G-x-W-D-Y-[LIVM]-F
Sequences known to belong to this class detected by the pattern: ALL.

[ 1] Bruinenberg P.O., Blaauw M., Kazemier B., Ab G. Yeast 6:245-254(1990).

967. MatK/TrnK amino terminal region (MatK_N) 

[1] Mohr G, Perlman PS, Lambowitz AM; Medline: 94077696 "Evolutionary relationships
among group II intron-encoded proteins and identification of a conserved domain that may be
related to maturase function." Nucleic Acids Res 1993;21:4991-4997.

Number of members: 495

968. MOZ/SAS family (MOZ_SAS)

This region of these proteins has been suggested to be homologous to acetyltransferases [1]. 
However the similarity is not supported by standard sequence analysis. 
Number of members: 15 

[1] Kamine J, Elangovan B, Subramanian T, Coleman D, Chinnadurai G; Medline: 96182937
"Identification of a cellular protein that specifically interacts with the essential cysteine
region of the HIV-1 Tat transactivator." Virology 1996;216:357-366.

[2] Reifsnyder C, Lowell J, Clarke A, Pillus L; Medline: 96376969 "Yeast SAS silencing
genes and human genes associated with AML and HIV-1 Tat interactions are homologous
with acetyltransferases" [see comments] [published erratum appears in Nat Genet 1997
May;16(1):109] Nat Genet 1996;14:42-49.

969. mRNA capping enzyme (mRNA_cap_enzyme) 

[1] Hakansson K, Doherty AJ, Shuman S, Wigley DB; Medline: 97304383 "X-ray 
crystallography reveals a large conformational change during guanyl transfer by mRNA 
capping enzymes." Cell 1997;89:545-553. 

Number of members: 7 

970. DNA mismatch repair proteins mutS family signature (MutS_C) 

Mismatch repair contributes to the overall fidelity of DNA replication [1]. It involves the
correction of mismatched base pairs that have been missed by the proofreading element of the
DNA polymerase complex. The sequences of some proteins involved in mismatch repair in
different organisms have been found to be evolutionarily related [2,3]. One of these families is
called mutS [4,E1]; it consists of:

- Prokaryotic mutS protein (also called hexA in Streptococcus pneumoniae). MutS is
thought to carry out the mismatch recognition step of DNA repair.

- Eukaryotic MSH1, which is involved in mitochondrial DNA repair.

- Eukaryotic MSH2, which is involved in nuclear postreplication mismatch repair. MSH2 
heterodimerizes with MSH6. In man, MSH2 is involved in a form of familial hereditary 
nonpolyposis colon cancer (HNPCC). 

- Eukaryotic MSH3, which is probably involved in the repair of large loops. 

- Eukaryotic MSH4, which is involved in meiotic recombination. 

- Eukaryotic MSH5, which is involved in meiotic recombination. 

- Eukaryotic MSH6 (also known as G/T mismatch binding protein), a DNA-repair protein 
that binds to G/T mismatches through heterodimerization with MSH2. 

- Prokaryotic protein mutS2 whose function is not yet known. 

- A coral (Sarcophyton glaucum) mitochondrial encoded mutS-like protein.

As a signature pattern for this class of mismatch repair proteins, a region rich in glycine and
negatively charged residues was selected. This region is found in the C-terminal section of
these proteins, about 80 residues to the C-terminal of an ATP-binding site motif 'A' (P-loop)
(see <PDOC00017>).

Consensus pattern: [ST]-[LIVMF]-x-[LIVM]-x-D-E-[LIVMFY]-[GC]-[RKH]-G-[GST]-x(4)-G
Sequences known to belong to this class detected by the pattern: ALL, except for mutS2.

[ 1] Modrich P. Annu. Rev. Biochem. 56:435-466(1987).
[ 2] Haber L.T., Walker G.C. EMBO J. 10:2707-2715(1991).
[ 3] New L., Liu K., Crouse G.F. Mol. Gen. Genet. 239:97-108(1993).
[ 4] Eisen J.A. Nucleic Acids Res. 26:4291-4300(1998).

971. MutS family, N-terminal putative DNA binding domain (MutS_N) 

This family consists of the N-terminal region of proteins in the mutS family of DNA
mismatch repair proteins and is found associated with MutS_C located in the C-terminal
region. The mutS family of proteins is named after the Salmonella typhimurium MutS protein
involved in mismatch repair; other members of the family include the eukaryotic MSH
1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human
MSH has been implicated in hereditary non-polyposis colorectal carcinoma (HNPCC) and is a
mismatch binding protein [2]. The aligned region corresponds in part with domains A1, A2
(which may bind DNA) and B (which binds dsDNA in vitro) from T. thermophilus MutS as
characterised in [1].
Number of members: 43 

972. Domain in Myosin and Kinesin Tails (MyTH4) 

Domain present twice in myosin-VIIa, and also present in 3 other myosins. 

[1] Chen ZY, Hasson T, Kelley PM, Schwender BJ, Schwartz MF, Ramakrishnan M, 
Kimberling WJ, Mooseker MS, Corey DP; Medline: 97038686 "Molecular cloning and
domain structure of human myosin-VIIa, the gene product defective in Usher syndrome IB."
Genomics 1996;36:440-448.

Number of members: 21

973. Sodium and potassium ATPases beta subunits signatures (Na_K-ATPase) 

The sodium pump (Na+,K+ ATPase), located in the plasma membrane of all animal cells [1],
is a heterotrimer of a catalytic subunit (alpha chain), a glycoprotein subunit of about 34 Kd
(beta chain) and a small hydrophobic protein of about 6 Kd. The beta subunit seems [2] to
regulate, through the assembly of alpha/beta heterodimers, the number of sodium pumps
transported to the plasma membrane.

Structurally the beta subunit is composed of a charged cytoplasmic domain of about 35
residues, followed by a transmembrane region, and a large extracellular domain that contains
three disulfide bonds and glycosylation sites. This structure is schematically represented in
the figure below.

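xxxxxxxxxxxxxxxxxxxxxxxxCxxxxCxCxxCxxxxxxxCxxxxxxxxxxxCxxxx
<-----Cyt-----><--TM--><----------Extracellular---------->

[Schematic: the disulfide-bond brackets and the '*' pattern-position markers of the original
figure are not recoverable.]

'C': conserved cysteine involved in a disulfide bond.
'*': position of the patterns.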

Two isoforms of the beta subunit (beta-1 and beta-2) are currently known; they share about
50% sequence identity. Gastric (K+, H+) ATPase (proton pump) responsible for acid
production in the stomach consists of two subunits [3]; the beta chain is highly similar to the
sodium pump beta subunits. Two signature patterns were developed for beta subunits. The
first is located in the cytoplasmic domain, while the second is found in the extracellular
domain and contains two of the cysteines involved in disulfide bonds.



Consensus pattern: [FYW]-x(2)-[FYW]-x-[FYW]-[DN]-x(6)-[LIVM]-G-R-T-x(3)-W
Sequences known to belong to this class detected by the pattern: ALL.

Consensus pattern: [RK]-x(2)-C-[RKQWI]-x(5)-L-x(2)-C-[SA]-G [The two Cs are involved in disulfide bonds]
Sequences known to belong to this class detected by the pattern: ALL, except for the beta
subunit of the sodium pump of brine shrimp, whose sequence is highly divergent in that region.

[ 1] Horisberger J.D., Lemas V., Krahenbul J.P., Rossier B.C. Annu. Rev. Physiol.
53:565-584(1991).
[ 2] McDonough A.A., Gerring K., Farley R.A. FASEB J. 4:1598-1605(1990).
[ 3] Toh B.-H., Gleeson P.A., Simpson R.J., Moritz R.L., Callaghan J.M., Goldkorn I., Jones
C.M., Martinelli T.M., Mu F.-T., Humphris D.C., Pettitt J.M., Mori Y., Masuda T.,
Sobieszczuk P., Weinstock J., Mantamadiotis T., Baldwin G.S. Proc. Natl. Acad. Sci. U.S.A.
87:6418-6422(1990).

974. Respiratory-chain NADH dehydrogenase subunit 1 signatures (NADHdh)

Respiratory-chain NADH dehydrogenase (EC 1.6.5.3) [1,2] (also known as complex I or
NADH-ubiquinone oxidoreductase) is an oligomeric enzymatic complex located in the inner
mitochondrial membrane which also seems to exist in the chloroplast and in cyanobacteria
(as a NADH-plastoquinone oxidoreductase). Among the 25 to 30 polypeptide subunits of this
bioenergetic enzyme complex there are fifteen which are located in the membrane part, seven
of which are encoded by the mitochondrial and chloroplast genomes of most species. The
most conserved of these organelle-encoded subunits is known as subunit 1 (gene ND1 in
mitochondrion, and NDH1 in chloroplast) and seems to contain the ubiquinone binding site.

The NDl subunit is highly similar to subunit 4 of Escherichia coli formate hydrogenlyase 
(gene hycD), subunit C of hydrogenase-4 (gene hyfC). Paracoccus denitrificans NQ08 and 
Escherichia coli nuoH NADH-ubiquinone oxidoreductase subunits also belong to this family 
[3]. Two signature patterns were developed based on conserved regions of this subunit. 

30 

Consensus pattern G-[LIVMFYKRS]-[LIVMAGP]-Q-x-[LIVMFY]-x-D-[AGIM]-[LIVMFTA]-K-[LVMYST]-[LIVMFYG]-x-[KR]-[EQG]
Sequences known to belong to this class detected by the pattern ALL, except for
watermelon and Leishmania ND1.

Consensus pattern P-F-D-[LIVMFYQ]-[STAGPVM]-E-[GAC]-E-x-[EQ]-[LIVMS]-x(2)-G
Sequences known to belong to this class detected by the pattern ALL, except for
Chlamydomonas reinhardtii and Pisaster ochraceus ND1, and tobacco NDH1.

[1] Ragan C.I. Curr. Top. Bioenerg. 15:1-36(1987).
[2] Weiss H., Friedrich T., Hofhaus G., Preis D. Eur. J. Biochem. 197:563-576(1991).
[3] Weidner U., Geier S., Ptock A., Friedrich T., Leif H., Weiss H. J. Mol. Biol.
233:109-122(1993).

975. Nickel-dependent hydrogenases large subunit signatures (NiFeSe_Hases)

Hydrogenases are enzymes that catalyze the reversible activation of hydrogen and which 
occur widely in prokaryotes as well as in some eukaryotes. There are various types of 
hydrogenases, but all of them seem to contain at least one iron-sulfur cluster. They can be 
broadly divided into two groups: hydrogenases containing nickel and, in some cases, also 
selenium (the [NiFe] and [NiFeSe] hydrogenases) and those lacking nickel (the [Fe] 
hydrogenases). 

The [NiFe] and [NiFeSe] hydrogenases are heterodimers that consist of a small subunit that
contains a signal peptide and a large subunit. All the known large subunits seem to be
evolutionarily related [1]; they contain two Cys-x-x-Cys motifs; one at their N-terminal end;
the other at their C-terminal end. These four cysteines are involved in the binding of nickel 
[2]. In the [NiFeSe] hydrogenases the first cysteine of the C-terminal motif is a 
selenocysteine which has experimentally been shown to be a nickel ligand [3]. Two patterns 
were developed which are centered on the Cys-x-x-Cys motifs. 

Alcaligenes eutrophus possesses a NAD-reducing cytoplasmic hydrogenase (hoxS) [4]; this
enzyme is composed of four subunits. Two of these subunits (beta and delta) are responsible
for the hydrogenase reaction and are evolutionarily related to the large and small subunits of
membrane-bound hydrogenases. The alpha subunit of coenzyme F420 hydrogenase (EC
1.12.99.1) (FRH) from archaebacterial methanogens also belongs to this family.

Consensus pattern R-G-[LIVMF]-E-x(15)-[QESM]-R-x-C-G-[LIVM]-C [The two Cs are
nickel ligands] Sequences known to belong to this class detected by the pattern ALL.

Consensus pattern [FY]-D-P-C-[LIM]-[ASG]-C-x(2,3)-H [The two Cs are nickel ligands] 
Sequences known to belong to this class detected by the pattern ALL. 
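
The consensus patterns in these entries use the PROSITE notation: elements are separated by hyphens, x stands for any residue, square brackets list the residues allowed at a position, curly braces list the residues excluded, and a number or range in parentheses gives a repeat count. As a minimal sketch, which is not part of the original disclosure, such a pattern could be translated into a regular expression roughly as follows. The function name prosite_to_regex and the test sequence are hypothetical and serve only as illustration.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style consensus pattern into a Python regex string.

    Handles x, [..], {..}, single residues and repeat counts such as (15) or (2,3).
    Terminus anchors ('<' and '>') are not handled in this sketch.
    """
    pieces = []
    for element in pattern.replace(" ", "").split("-"):
        # Separate the residue specification from an optional repeat count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        # Bracketed sets and single letters are already valid regex syntax.
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        pieces.append(core)
    return "".join(pieces)

# Second hydrogenase large-subunit pattern quoted above.
regex = prosite_to_regex("[FY]-D-P-C-[LIM]-[ASG]-C-x(2,3)-H")
# Hypothetical test sequence, invented for illustration only.
hit = re.search(regex, "AAYDPCLACRRHAA")
print(regex, hit.span() if hit else None)
```

A match of the regular expression only indicates that the sequence contains the motif; it does not by itself establish membership in the family.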

[1] Menon N.K., Robbins J., Peck H.D. Jr., Chatelus C.Y., Choi E.-S., Przybyla A.E. J.
Bacteriol. 172:1969-1977(1990).
[2] Volbeda A., Charon M.-H., Piras C., Hatchikian E.C., Frey M., Fontecilla-Camps J.C.
Nature 373:580-587(1995).
[3] Eidsness M.K., Scott R.A., Prickrill B., der Vartaninan D.V., LeGall J., Moura I., Moura
J.J.G., Peck H.D. Jr. Proc. Natl. Acad. Sci. U.S.A. 86:147-151(1989).
[4] Tran-Betcke A., Warnecke U., Boecker C., Zaborosch C., Friedrich B. J. Bacteriol.
172:2920-2929(1990).

976. NADH-Ubiquinone oxidoreductase (complex I), chain 5 C-terminus (oxidored_q1_C)

This sub-family represents a carboxyl terminal extension of oxidored_q1. Only
NADH-Ubiquinone chain 5 from chloroplasts is in this family. This sub-family is part of complex I
which catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is 
associated with proton translocation across the membrane. 
Number of members: 572 

[1] Walker JE; Medline: 93110040 "The NADH:ubiquinone oxidoreductase (complex I) of 
respiratory chains." Q Rev Biophys 1992;25:253-324. 

977. NADH-Ubiquinone oxidoreductase (complex I), chain 5 N-terminus (oxidored_q1_N)

This sub-family represents an amino terminal extension of oxidored_q1. Only
NADH-Ubiquinone chain 5 and eubacterial chain L are in this family. This sub-family is part of
complex I which catalyses the transfer of two electrons from NADH to ubiquinone in a 
reaction that is associated with proton translocation across the membrane. 
Number of members: 546 

[1] Walker JE; Medline: 93110040 "The NADH:ubiquinone oxidoreductase (complex I) of
respiratory chains." Q Rev Biophys 1992;25:253-324. 

978. oxidored_q2. NADH-UBIQUINONE OXIDOREDUCTASE CHAIN 4L (EC 1.6.5.3). 
ND4L OR NAD4L. Arabidopsis thaliana (Mouse-ear cress). Mitochondrion. OC Eukaryota; 
Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; 
Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis. 

CATALYTIC ACTIVITY: NADH + UBIQUINONE = NAD(+) + UBIQUINOL. 

[1] SEQUENCE FROM N.A. MEDLINE; 93156682. Brandt P., Sunkel S., Unseld M., 
Brennicke A., Knoop V.; "The nad4L gene is encoded between exon c of nad5 and orf25 in
the Arabidopsis mitochondrial genome."; Mol. Gen. Genet. 236:33-38(1992).
[2] SEQUENCE FROM N.A. STRAIN=CV. COLUMBIA; MEDLINE; 97141919. Unseld
M., Marienfeld J.R., Brandt P., Brennicke A.; "The mitochondrial genome of Arabidopsis 
thaliana contains 57 genes in 366,924 nucleotides."; Nat. Genet. 15:57-61(1997). 

979. oxidored_q4. Protein name: NADH-PLASTOQUINONE OXIDOREDUCTASE CHAIN
3, CHLOROPLAST. Synonym(s): EC 1.6.5.3. Gene name(s): NDHC OR NDH3. From Zea
mays (Maize). Encoded on Chloroplast. Taxonomy: Eukaryota; Viridiplantae; Embryophyta;
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; Zea.

CATALYTIC ACTIVITY: NADH + PLASTOQUINONE = NAD(+) + 
PLASTOQUINOL. 

SIMILARITY: BELONGS TO THE COMPLEX I SUBUNIT 3 FAMILY. 

[1] SEQUENCE FROM N.A. MEDLINE; 89281491. Steinmueller K., Ley A.C., Steinmetz 
A.A., Sayre R.T., Bogorad L.; "Characterization of the ndhC-psbG-ORF157/159 operon of 
maize plastid DNA and of the cyanobacterium Synechocystis sp. PCC6803."; Mol. Gen. 
Genet. 216:60-69(1989). 

[2] SEQUENCE FROM N.A. MEDLINE; 95395841. Maier R.M., Neckermann K., Igloi 
G.L., Koessel H.; "Complete sequence of the maize chloroplast genome: gene content, 
hotspots of divergence and fine tuning of genetic information by transcript editing."; J. Mol. 
Biol. 251:614-628(1995). 

980. PAC: PAC motif

PAC motif occurs C-terminal to a subset of all known PAS motifs. It is proposed to 
contribute to the PAS domain fold [3]. Number of members: 181 

[1] Medline: 97446881 PAS domain S-boxes in archaea, bacteria and sensors for oxygen and 
redox. Zhulin IB, Taylor BL, Dixon R; Trends Biochem Sci 1997;22:331-333. 
[2] Medline: 95275818. 1.4 A structure of photoactive yellow protein, a cytosolic 
photoreceptor: unusual fold, active site, and chromophore. Borgstahl GE, Williams DR, 
Getzoff ED; Biochemistry 1995;34:6278-6287. 

[3] Medline: 98044337. PAS: a multifunctional domain family comes to light. Ponting CP, 
Aravind L; Curr Biol 1997;7:674-677. 

981. PARP: Poly(ADP-ribose) polymerase catalytic region. 

Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from 
NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their 
affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA 
damage. 

The carboxyl-terminal region is the most highly conserved region of the protein. Experiments 
have shown that a carboxyl 40 kDa fragment is still catalytically active [2]. Number of 
members: 19 

[1] Medline: 96353841 Structure of the catalytic fragment of poly(AD-ribose) polymerase 
from chicken. Ruf A, Mennissier de Murcia J, de Murcia G, Schulz GE; Proc Natl Acad Sci 
USA 1996;93:7481-7485. 

[2] Medline: 93293867 The carboxyl-terminal domain of human poly(ADP-ribose) 
polymerase. Overproduction in Escherichia coli, large scale purification, and 
characterization. Simonin F, Hofferer L, Panzeter PL, Muller S, de Murcia G, Althaus FR; J 
Biol Chem 1993;268:13454-13461. 

982. PC_rep: Proteasome/cyclosome repeat 

[1] Medline: 97348748 A repetitive sequence in subunits of the 26S proteasome and 20S
cyclosome (anaphase-promoting complex). Lupas A, Baumeister W, Hofmann K; Trends 
Biochem Sci 1997;22:195-196. 
Number of members: 112 

983. Peptidase_M1: Peptidase family M1

Members of this family are aminopeptidases. The members differ widely in specificity,
hydrolysing acidic, basic or neutral N-terminal residues. This family includes leukotriene-A4
hydrolase Swiss:P09960; this enzyme also has an aminopeptidase activity [1]. Number of
members: 72 

[1] Medline: 95405261 Evolutionary families of metallopeptidases. Rawlings ND, Barrett AJ; 
Meth Enzymol 1995;248:183-228. 

984. Neutral zinc metallopeptidases, zinc-binding region signature (Peptidase_M8) 
PROSITE cross-reference(s) PS00142; ZINC_PROTEASE 

The majority of zinc-dependent metallopeptidases (with the notable exception of the 
carboxypeptidases) share a common pattern of primary structure [1,2,3] in the part of their 
sequence involved in the binding of zinc, and can be grouped together as a 
superfamily, known as the metzincins, on the basis of this sequence similarity. They can be
classified into a number of distinct families [4,E1] which are listed below along with the
proteases which are currently known to belong to these families.

Family M1

- Bacterial aminopeptidase N (EC 3.4.11.2) (gene pepN). 

- Mammalian aminopeptidase N (EC 3.4.11.2). 

- Mammalian glutamyl aminopeptidase (EC 3.4.11.7) (aminopeptidase A). It may play a 
role in regulating growth and differentiation of early B-lineage cells. 

- Yeast aminopeptidase yscII (gene APE2).

- Yeast alanine/arginine aminopeptidase (gene AAP1).

- Yeast hypothetical protein YIL137c. 




- Leukotriene A-4 hydrolase (EC 3.3.2.6). This enzyme is responsible for the hydrolysis of 
an epoxide moiety of LTA-4 to form LTB-4: it has been shown that it binds zinc and is 
capable of peptidase activity. 

Family M2 

- Angiotensin-converting enzyme (EC 3.4.15.1) (dipeptidyl carboxypeptidase I) (ACE) the 
enzyme responsible for hydrolyzing angiotensin I to angiotensin II. There are two forms 
of ACE: a testis-specific isozyme and a somatic isozyme which has two active centers. 
Family M3 

- Thimet oligopeptidase (EC 3.4.24.15), a mammalian enzyme involved in the cytoplasmic 
degradation of small peptides. 

- Neurolysin (EC 3.4.24.16) (also known as mitochondrial oligopeptidase M or microsomal 
endopeptidase). 

- Mitochondrial intermediate peptidase precursor (EC 3.4.24.59) (MIP). It is involved in the
second stage of processing of some proteins imported in the mitochondrion. 

- Yeast saccharolysin (EC 3.4.24.37) (proteinase yscD). 

- Escherichia coli and related bacteria dipeptidyl carboxypeptidase (EC 3.4.15.5) (gene 
dcp). 

- Escherichia coli and related bacteria oligopeptidase A (EC 3.4.24.70) (gene opdA or prlC). 

- Yeast hypothetical protein YKL134c. 
Family M4 

- Thermostable thermolysins (EC 3.4.24.27), and related thermolabile neutral proteases 
(bacillolysins) (EC 3.4.24.28) from various species of Bacillus. 

- Pseudolysin (EC 3.4.24.26) from Pseudomonas aeruginosa (gene lasB). 

- Extracellular elastase from Staphylococcus epidermidis. 

- Extracellular protease prt1 from Erwinia carotovora.

- Extracellular minor protease smp from Serratia marcescens. 

- Vibriolysin (EC 3.4.24.25) from various species of Vibrio. 

- Protease prtA from Listeria monocytogenes. 

- Extracellular proteinase proA from Legionella pneumophila. 

Family M5 

- Mycolysin (EC 3.4.24.31) from Streptomyces cacaoi. 

Family M6

- Immune inhibitor A from Bacillus thuringiensis (gene ina). Ina degrades two classes of 
insect antibacterial proteins, attacins and cecropins. 

Family M7 

- Streptomyces extracellular small neutral proteases 
Family M8 

- Leishmanolysin (EC 3.4.24.36) (surface glycoprotein gp63), a cell surface protease from 
various species of Leishmania. 

Family M9 

- Microbial collagenase (EC 3.4.24.3) from Clostridium perfringens and Vibrio 
alginolyticus. 

Family M10A

- Serralysin (EC 3.4.24.40), an extracellular metalloprotease from Serratia. 

- Alkaline metalloproteinase from Pseudomonas aeruginosa (gene aprA). 

- Secreted proteases A, B, C and G from Erwinia chrysanthemi. 

- Yeast hypothetical protein YIL108w.

Family M10B

- Mammalian extracellular matrix metalloproteinases (known as matrixins) [5]: MMP-1 (EC 
3.4.24.7) (interstitial collagenase), MMP-2 (EC 3.4.24.24) (72 Kd gelatinase), MMP-9 (EC 
3.4.24.35) (92 Kd gelatinase), MMP-7 (EC 3.4.24.23) (matrilysin), MMP-8 (EC 3.4.24.34)
(neutrophil collagenase), MMP-3 (EC 3.4.24.17) (stromelysin-1), MMP-10 (EC 3.4.24.22)
(stromelysin-2), MMP-11 (stromelysin-3), and MMP-12 (EC 3.4.24.65) (macrophage
metalloelastase).

- Sea urchin hatching enzyme (envelysin) (EC 3.4.24.12). A protease that allows the
embryo to digest the protective envelope derived from the egg extracellular matrix. 

- Soybean metalloendoproteinase 1. 



Family M11

- Chlamydomonas reinhardtii gamete lytic enzyme (GLE).
Family M12A 

- Astacin (EC 3.4.24.21), a crayfish endoprotease.

- Meprin A (EC 3.4.24.18), a mammalian kidney and intestinal brush border 
metalloendopeptidase. 

- Bone morphogenic protein 1 (BMP-1), a protein which induces cartilage and bone 
formation and which expresses metalloendopeptidase activity. The Drosophila homolog 
of BMP-1 is the dorsal-ventral patterning protein tolloid. 

- Blastula protease 10 (BP10) from Paracentrotus lividus and the related protein SpAN
from Strongylocentrotus purpuratus. 

- Caenorhabditis elegans protein toh-2. 

- Caenorhabditis elegans hypothetical protein F42A10.8. 

- Choriolysins L and H (EC 3.4.24.67) (also known as embryonic hatching proteins LCE
and HCE) from the fish Oryzias latipes. These proteases participate in the breakdown
of the egg envelope, which is derived from the egg extracellular matrix, at the time of 
hatching. 

Family M12B 

- Snake venom metalloproteinases [6]. This subfamily mostly groups proteases that act in 
hemorrhage. Examples are: adamalysin II (EC 3.4.24.46), atrolysin C/D (EC 
3.4.24.42), atrolysin E (EC 3.4.24.44), fibrolase (EC 3.4.24.72), trimerelysin I (EC
3.4.24.52) and II (EC 3.4.24.53).

- Mouse cell surface antigen MS2. 

Family M13 

- Mammalian neprilysin (EC 3.4.24.11) (neutral endopeptidase) (NEP). 

- Endothelin-converting enzyme 1 (EC 3.4.24.71) (ECE-1), which processes the precursor of
endothelin to release the active peptide. 

- Kell blood group glycoprotein, a major antigenic protein of erythrocytes. The Kell protein 
is very probably a zinc endopeptidase. 

- Peptidase O from Lactococcus lactis (gene pepO). 

Family M27

- Clostridial neurotoxins, including tetanus toxin (TeTx) and the various botulinum toxins 
(BoNT). These toxins are zinc proteases that block neurotransmitter release by 
proteolytic cleavage of synaptic proteins such as synaptobrevins, syntaxin and SNAP-25 
[7,8]. 

Family M30 

- Staphylococcus hyicus neutral metalloprotease. 
Family M32 

- Thermostable carboxypeptidase 1 (EC 3.4.17.19) (carboxypeptidase Taq), an enzyme
from Thermus aquaticus which is most active at high temperature. 

Family M34 

- Lethal factor (LF) from Bacillus anthracis, one of the three proteins composing the
anthrax toxin. 

Family M35 

- Deuterolysin (EC 3.4.24.39) from Penicillium citrinum and related proteases from various 
species of Aspergillus. 

Family M36 

- Extracellular elastinolytic metalloproteinases from Aspergillus. 

From the tertiary structure of thermolysin, the position of the residues acting as zinc 
ligands and those involved in the catalytic activity are known. Two of the zinc ligands are 
histidines which are very close together in the sequence; C-terminal to the first histidine is 
a glutamic acid residue which acts as a nucleophile and promotes the attack of a water 
molecule on the carbonyl carbon of the substrate. A signature pattern which includes the 
two histidine and the glutamic acid residues is sufficient to detect this superfamily of 
proteins. 

Consensus pattern [GSTALIVN]-x(2)-H-E-[LIVMFYW]-{DEHRKP}-H-x-[LIVMFYWGSPQ]
[The two H's are zinc ligands] [E is the active site residue]
Sequences known to belong to this class detected by the pattern ALL, except
for members of families M5, M7 and M11.
Other sequence(s) detected in SWISS-PROT: 57; including Neurospora crassa
conidiation-specific protein 13 which could be a zinc-protease.
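
As an illustration only, the zinc-binding signature can be written by hand as a regular expression in which named groups mark the two histidine ligands and the active-site glutamate, so that their positions in a sequence can be reported. This is a sketch under stated assumptions: the names ZINC_SITE and zinc_site_residues and the test sequences are hypothetical and are not taken from the original disclosure.

```python
import re

# Hand-translated from the zinc-binding consensus pattern; {DEHRKP} becomes an
# excluded set. Named groups mark the annotated residues.
ZINC_SITE = re.compile(
    r"[GSTALIVN].{2}(?P<his1>H)(?P<glu>E)[LIVMFYW][^DEHRKP](?P<his2>H).[LIVMFYWGSPQ]"
)

def zinc_site_residues(sequence):
    """Return 1-based positions of the two His ligands and the Glu for each match."""
    sites = []
    for m in ZINC_SITE.finditer(sequence):
        sites.append({name: m.start(name) + 1 for name in ("his1", "glu", "his2")})
    return sites

# Hypothetical sequences, invented for illustration only.
print(zinc_site_residues("MKGAAHELAHALQ"))  # contains the motif
print(zinc_site_residues("MKGAAHELDHALQ"))  # D at the excluded position, no match
```

Note that finditer reports non-overlapping matches only, which is sufficient for locating a single zinc-binding site per region.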

[1] Jongeneel C.V., Bouvier J., Bairoch A. FEBS Lett. 242:211-214(1989).
[2] Murphy G.J.P., Murphy G., Reynolds J.J. FEBS Lett. 289:4-7(1991).
[3] Bode W., Grams F., Reinemer P., Gomis-Rueth F.-X., Baumann U., McKay D.B.,
Stoecker W. Zoology 99:237-246(1996).
[4] Rawlings N.D., Barrett A.J. Meth. Enzymol. 248:183-228(1995).
[5] Woessner J. Jr. FASEB J. 5:2145-2154(1991).
[6] Hite L.A., Fox J.W., Bjarnason J.B. Biol. Chem. Hoppe-Seyler 373:381-385(1992).
[7] Montecucco C., Schiavo G. Trends Biochem. Sci. 18:324-327(1993).
[8] Niemann H., Blasi J., Jahn R. Trends Cell Biol. 4:179-185(1994).

985. PHO4: Phosphate transporter family

This family includes PHO-4 from Neurospora crassa which is a Na(+)-phosphate
symporter [1]. This family also contains the leukemia virus receptor Swiss:Q08344. Number 
of members: 41 

[1] Medline: 95249577 Repressible cation-phosphate symporters in Neurospora crassa. 
Versaw WK, Metzenberg RL; Proc Natl Acad Sci U S A 1995;92:3884-3887. 

986. Photosynthetic reaction center proteins signature (photoRC) 
PROSITE cross-reference(s): PS00244; REACTION_CENTER 

In the photosynthetic reaction center of purple bacteria, two homologous integral 
membrane proteins, L(ight) and M(edium), are known to be essential to the light-mediated 
water-splitting process. In the photosystem II of eukaryotic chloroplasts two related 
proteins are involved: the D1 (psbA) and D2 proteins (psbD). These four types of protein
probably evolved from a common ancestor [see 1,2 for recent reviews]. 

A signature pattern was developed which includes two conserved histidine residues. In L
and M chains, the first histidine is a ligand of the magnesium ion of the special pair 
bacteriochlorophyll, the second is a ligand of a ferrous non-heme iron atom. In photosystem 
II these two histidines are thought to play a similar role. 

Consensus pattern [NQH]-x(4)-P-x-H-x(2)-[SAG]-x(11)-[SAGC]-x-H-[SAG](2)
[The first H is a magnesium ligand] [The second H is an iron ligand]
Sequences known to belong to this class detected by the pattern ALL, except
for broad bean psbA which has Gln instead of the second His.

[1] Michel H., Deisenhofer J. Biochemistry 27:1-7(1988).
[2] Barber J. Trends Biochem. Sci. 12:321-326(1987).

987. phytochrome: Phytochrome region 

This family contains a region specific to phytochrome proteins. Number of members: 
145 

988. PI3K_C2: C2 domain 

Phosphoinositide 3-kinase region postulated to contain a C2 domain. Outlier of C2 family. 
Number of members: 39 

[1] Medline: 97388296 Using structure to define the function of phosphoinositide 3-kinase
family members. Domin J, Waterfield MD; FEBS Lett 1997;410:91-95.

[2] Medline: 97398940 Phosphoinositide 3-kinases: a conserved family of signal transducers.
Vanhaesebroeck B, Leevers SJ, Panayotou G, Waterfield MD; Trends Biochem Sci
1997;22:267-272.

989. PI3Ka: Phosphoinositide 3-kinase family, accessory domain (PIK domain) 
PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been 
suggested [2] to be involved in substrate presentation. 

Number of members: 47 

[1] Medline: 97388296 Using structure to define the function of phosphoinositide 3-kinase
family members. Domin J, Waterfield MD; FEBS Lett 1997;410:91-95.
[2] Medline: 94069320 Phosphatidylinositol 4-kinase: gene structure and requirement for 
yeast cell viability. Flanagan CA, Schnieders EA, Emerick AW, Kunisawa R, Admon A, 
Thorner J; Science 1993;262:1444-1448. 

990. P-II protein signatures 

PROSITE cross-reference(s): PS00496; PII_GLNB_UMP, PS00638; PII_GLNB_CTER 

The P-II protein (gene glnB) is a bacterial protein important for the control of glutamine 
synthetase [1,2,3]. In nitrogen-limiting conditions, when the ratio of glutamine to
2-ketoglutarate decreases, P-II is uridylylated on a tyrosine residue to form P-II-UMP.
P-II-UMP allows the deadenylation of glutamine synthetase (GS), thus activating the enzyme.
Conversely, in nitrogen excess, P-II-UMP is deuridylylated and then promotes the adenylation
of GS. P-II also indirectly controls the transcription of the GS gene (glnA) by preventing
NR-II (ntrB) from phosphorylating NR-I (ntrC), which is the transcriptional activator of glnA.
Once P-II is uridylylated, these events are reversed.

P-II is an extremely well conserved protein of about 110 amino acid residues. The tyrosine
which is uridylylated is located in the central part of the protein.

In cyanobacteria, P-II seems to be phosphorylated on a serine residue rather than being
uridylylated.

In methanogenic archaebacteria, the nitrogenase iron protein gene (nifH) is followed by two 
open reading frames highly similar to the eubacterial P-II protein [4]. These proteins could 
be involved in the regulation of nitrogen fixation. 

In the red alga, Porphyra purpurea, there is a glnB homolog encoded in the chloroplast 
genome. 

Other proteins highly similar to glnB are: 

- Bacillus subtilis protein nrgB [5].

- Escherichia coli hypothetical protein ybaI [6].

Two signature patterns were developed for P-II protein. The first one is a conserved
stretch (in eubacteria) of six residues which contains the uridylylated tyrosine, the other
is derived from a conserved region in the C-terminal part of the P-II protein.

Consensus pattern Y-[KR]-G-[AS]-[AE]-Y [The second Y is uridylylated]
Sequences known to belong to this class detected by the pattern ALL glnB's
from eubacteria.

Consensus pattern [ST]-x(3)-G-[DY]-G-[KR]-[IV]-[FW]-[LIVM]-x(2)-[LIVM]

[1] Magasanik B. Biochimie 71:1005-1012(1989).
[2] Holtel A., Merrick M. Mol. Gen. Genet. 215:134-138(1988).
[3] Cheah E., Carr P.D., Suffolk P.M., Vasudevan S.G., Dixon N.E., Ollis D.L. Structure
2:981-990(1994).
[4] Sibold L., Henriquet M., Possot O., Aubert J.-P. Res. Microbiol. 142:5-12(1991).
[5] Wray L.V. Jr., Atkinson M.R., Fisher S.H. J. Bacteriol. 176:108-114(1994).
[6] Allikmets R., Gerrard B.C., Court D., Dean M.C. Gene 136:231-236(1993).

991. PIP5K: Phosphatidylinositol-4-phosphate 5-Kinase 

This family contains a region from the common kinase core found in the type I 
phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family as described in [1]. The family 
consists of various type I, II and III PIP5K enzymes. PIP5K catalyses the formation of 
phosphoinositol-4,5-bisphosphate via the phosphorylation of
phosphatidylinositol-4-phosphate, a precursor in the phosphoinositide signaling pathway.
Number of members: 33

[1] Medline: 98204859. Type I phosphatidylinositol-4-phosphate 5-kinases. Cloning of the 
third isoform and deletion/substitution analysis of members of this novel lipid kinase family. 
Ishihara H, Shibasaki Y, Kizuki N, Wada T, Yazaki Y, Asano T, Oka Y; J Biol Chem 
1998;273:8741-8748. 

[2] Medline: 97115834 Type I phosphatidylinositol-4-phosphate 5-kinases are distinct
members of this novel lipid kinase family. Loijens JC, Anderson RA; J Biol Chem
1996;271:32937-32943.

992. PolyA_pol: Poly A polymerase family 

This family includes nucleic acid independent RNA polymerases, such as poly(A)
polymerase (EC:2.7.7.19), which adds the poly(A) tail to mRNA. This family also includes the
tRNA nucleotidyltransferase (EC:2.7.7.25) that adds the CCA to the 3' end of the tRNA.
Number of members: 31

[1] Medline: 93066242 Identification of the gene for an Escherichia coli poly(A) polymerase. 
Cao GJ, Sarkar N; Proc Natl Acad Sci U S A 1992;89:10380-10384. 

993. Photosystem I psaA and psaB proteins signature (psaA_psaB) 
PROSITE cross-reference(s): PS00419; PHOTOSYSTEM_I_PSAAB

Photosystem I (PSI) [1] is an integral membrane protein complex that uses light energy to 
mediate electron transfer from plastocyanin to ferredoxin. PSI is found in the chloroplast 
of plants and cyanobacteria. The electron transfer components of the reaction center of 
PSI are a primary electron donor P-700 (chlorophyll dimer) and five electron acceptors: A0
(chlorophyll), A1 (a phylloquinone) and three 4Fe-4S iron-sulfur centers: Fx, Fa, and Fb.

PsaA and psaB, two closely related proteins, are involved in the binding of P700, A0, A1,
and Fx. psaA and psaB are both integral membrane proteins of 730 to 750 amino acids that 
seem to contain 11 transmembrane segments. The Fx 4Fe-4S iron-sulfur center is bound by 
four cysteines; two of these cysteines are provided by the psaA protein and the two others 
by psaB. The two cysteines in both proteins are proximal and located in a loop between 
the ninth and tenth transmembrane segments. A leucine zipper motif seems to be present [2] 
downstream of the cysteines and could contribute to dimerization of psaA/psaB. 

The signature pattern for these proteins is based on the perfectly conserved region that 
includes the two iron-sulfur binding cysteines. 

Consensus pattern C-D-G-P-G-R-G-G-T-C [The two C's bind the iron-sulfur center]

[1] Golbeck J.H. Biochim. Biophys. Acta 895:167-204(1987).
[2] Webber A.N., Malkin R. FEBS Lett. 264:1-14(1990).

994. PSBH: Photosystem II 10 kDa phosphoprotein 

This protein is phosphorylated in a light dependent reaction. 
Number of members: 20 

995. PsbJ 

This family consists of the photosystem II reaction center protein PsbJ from plants and 
cyanobacteria. In Synechocystis sp. PCC 6803, PsbJ regulates the number of photosystem II
centers in thylakoid membranes; it is a predicted 4 kDa protein with one membrane spanning
domain [1]. Number of members: 20 

[1] Medline: 93131892. Genetic and immunological analyses of the cyanobacterium 
Synechocystis sp. PCC 6803 show that the protein encoded by the psbJ gene regulates the 
number of photosystem II centers in thylakoid membranes. Lind LK, Shukla VK, Nyhus KJ, 
Pakrasi HB; J Biol Chem 1993;268:1575-1579. 

996. PSBT: Photosystem II reaction centre T protein 

The exact function of this protein is unknown. It probably consists of a single transmembrane 
spanning helix. The Swiss:P37256 protein appears to be (i) a novel photosystem II subunit
and (ii) required for maintaining optimal photosystem II activity under adverse growth 
conditions [1]. Number of members: 17 

[1] Medline: 94298765. The chloroplast ycf8 open reading frame encodes a 
photosystem II polypeptide which maintains photosynthetic activity under adverse growth 
conditions. Monod C, Takahashi Y, Goldschmidt-Clermont M, Rochaix JD; EMBO J 
1994;13:2747-2754. 

997. PSI_8. PHOTOSYSTEM I REACTION CENTRE SUBUNIT VIII. Synonym(s): PSI-I.
Gene name(s): PSAI. From Hordeum vulgare (Barley). Encoded on Chloroplast. Taxonomy:
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta; Magnoliophyta;
Liliopsida; Poales; Poaceae; Hordeum. 

MAY HELP IN THE ORGANIZATION OF THE PSAL SUBUNIT. BELONGS TO THE 
PSAI FAMILY. 

[1] SEQUENCE FROM N.A. MEDLINE; 90036933. Scheller H.V., Okkels J.S., Hoej P.B., 
Svendsen I., Roepstorff P., Moeller B.L.; "The primary structure of a 4.0-kDa photosystem I 
polypeptide encoded by the chloroplast psaI gene."; J. Biol. Chem. 264:18402-18406(1989).

998. PSI_PsaJ: Photosystem I reaction centre subunit IX / PsaJ 

This family consists of the photosystem I reaction centre subunit IX or PsaJ from various 
organisms including Synechocystis sp. (strain PCC 6803), Pinus thunbergii (green pine) and
Zea mays (maize). PsaJ Swiss:P19443 is a small 4.4 kDa, chloroplast-encoded, hydrophobic
subunit of the photosystem I reaction complex; its function is not yet fully understood [1].
PsaJ can be cross-linked to PsaF Swiss:P12356 and has a single predicted transmembrane
domain; it has a proposed role in maintaining PsaF in the correct orientation to allow for fast
electron transfer from soluble donor proteins to P700+ [1]. Number of members: 18 

[1] Medline: 99238330. A large fraction of PsaF is nonfunctional in photosystem I complexes 
lacking the PsaJ subunit. Fischer N, Boudreau E, Hippler M, Drepper F, Haehnel W, Rochaix 
JD; Biochemistry 1999;38:5546-5552. 

[2] Medline: 93252282. Genes encoding eleven subunits of photosystem I from the 
thermophilic cyanobacterium Synechococcus sp. Muhlenhoff U, Haehnel W, Witt H, 
Herrmann RG; Gene 1993;127:71-78. 

999. PSII. Protein name: PHOTOSYSTEM II P680 CHLOROPHYLL A APOPROTEIN.
Synonym(s): CP-47 PROTEIN. Gene name(s): PSBB. From Hordeum vulgare (Barley).
Encoded on Chloroplast. Taxonomy: Eukaryota; Viridiplantae; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; Hordeum. 

FUNCTION: THIS PROTEIN CONJUGATES WITH CHLOROPHYLL & 
CATALYZES THE PRIMARY LIGHT-INDUCED PHOTOCHEMICAL PROCESSES OF 
PHOTOSYSTEM II. SUBCELLULAR LOCATION: CHLOROPLAST THYLAKOID 
MEMBRANE. SIMILARITY: BELONGS TO THE PSBB / PSBC FAMILY. 

[1] SEQUENCE FROM N.A. STRAIN=CV. SABARLIS; MEDLINE; 89240047. Andreeva
A.V., Buryakova A.A., Reverdatto S.V., Chakhmakhcheva O.G., Efimov V.A.; "Nucleotide
sequence of the 5.2 kbp barley chloroplast DNA fragment, containing psbB-psbH-petB-petD
gene cluster."; Nucleic Acids Res. 17:2859-2860(1989).

[2] SEQUENCE FROM N.A. STRAIN=CV. SABARLIS; MEDLINE; 92207253. Efimov
V.A., Andreeva A.V., Reverdatto S.V., Chakhmakhcheva O.G.; "Photosystem II of rye.
Nucleotide sequence of the psbB, psbC, psbE, psbF, psbH genes of rye and chloroplast DNA
regions adjacent to them."; Bioorg. Khim. 17:1369-1385(1991).

[3] SEQUENCE OF 411-420. Hinz U.G.; "Isolation of the photosystem II reaction center
complex from barley. Characterization by circular dichroism spectroscopy and amino acid
sequencing."; Carlsberg Res. Commun. 50:285-298(1985).

1000. QRPTase. Quinolinate phosphoribosyl transferase. 

Quinolinate phosphoribosyl transferase (QPRTase) or nicotinate-nucleotide
pyrophosphorylase EC:2.4.2.19 is involved in the de novo synthesis of NAD in both
prokaryotes and eukaryotes. It catalyses the reaction of quinolinic acid with
5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to give rise to nicotinic
acid mononucleotide (NaMN), pyrophosphate and carbon dioxide [1,2]. Number of members:
26.

[1] Medline: 97169443. A new function for a common fold: the crystal structure of quinolinic
acid phosphoribosyltransferase. Eads JC, Ozturk D, Wexler TB, Grubmeyer C, Sacchettini 
JC; Structure 1997;5:47-58. 

[2]Medline: 96139309. The sequencing expression, purification, and steady-state kinetic 
analysis of quinolinate phosphoribosyl transferase from Escherichia coli. Bhatia R, Calvo 
KC; Arch Biochem Biophys 1996;325:270-278. 

1001. R3H domain 

The name of the R3H domain comes from the characteristic spacing of the most conserved 
arginine and histidine residues. The function of the domain is predicted to be binding 
ssDNA. Number of members: 28 

[1] Medline: 99003905. The R3H motif: a domain that binds single-stranded nucleic acids.
Grishin NV; Trends Biochem Sci 1998;23:329-330.

1002. recF protein signatures (RecF) 

The prokaryotic protein recF [1,2] is a single-stranded DNA-binding protein which also 
probably binds ATP. RecF is involved in DNA metabolism; it is required for recombinational 
DNA repair and for induction of the SOS response. RecF is a protein of about 350 to 370 
amino acid residues; there is a conserved ATP-binding site motif 'A' (P-loop) in the
N-terminal section of the protein as well as two other conserved regions, one located in the
central section, and the other in the C-terminal section. Signature patterns were derived from 
these two regions. 

Consensus pattern: [LIVM]-x(4)-[LIF]-x(6)-[LIF]-[LVF]-x-[GE]-[GSTAD]-[PA]-x(2)-R-R-
x-[FYW]-[LIVMF]-D. Sequences known to belong to this class detected by the pattern: ALL.

Consensus pattern: [LIVMFY](2)-x-D-x(2,3)-[SA]-[EH]-L-D-x(2)-[KRH]-x(3)-L. Sequences
known to belong to this class detected by the pattern: ALL, except for T. pallidum recF.
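
The consensus patterns in this listing use PROSITE-style notation: elements are separated by hyphens, x stands for any residue, square brackets list the allowed residues, and a number in parentheses gives a repeat count or range. As an illustration only, the following minimal Python sketch translates such a pattern into a regular expression and scans a sequence with it. The function names and the test sequence are hypothetical and are not part of the original disclosure; anchors such as < and > are not handled.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.replace(" ", "").rstrip(".").split("-"):
        if not element:
            continue
        # Split an element such as "x(2,3)" into its core "x" and repeat bounds.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            token = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"   # any residue except those listed
        else:
            token = core                      # single letter or [ABC] class
        if low and high:
            token += "{%s,%s}" % (low, high)
        elif low:
            token += "{%s}" % low
        parts.append(token)
    return "".join(parts)


def find_matches(pattern: str, sequence: str):
    """Return (1-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group(0)) for m in regex.finditer(sequence)]


# First recF signature as given in the text above.
RECF_PATTERN_1 = (
    "[LIVM]-x(4)-[LIF]-x(6)-[LIF]-[LVF]-x-[GE]-[GSTAD]-[PA]-x(2)-R-R-"
    "x-[FYW]-[LIVMF]-D"
)

# Made-up toy sequence built to satisfy the pattern; it is not a real recF protein.
TOY_SEQUENCE = "GS" + "LAAAALAAAAAALLAGGPAARRAFLD" + "GS"

print(find_matches(RECF_PATTERN_1, TOY_SEQUENCE))
```

A plain regular expression is used here because the pattern syntax maps onto it almost one to one; dedicated motif-scanning tools would normally be preferred for real analyses.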

[1] Sandler S.J., Chackerian B., Li J.T., Clark A.J. Nucleic Acids Res. 20:839-845(1992).
[2] Alonso J.C., Fisher L.M.; Mol. Gen. Genet. 246:680-686(1995).

1003. RibD C-terminal domain (RibD_C) 

The function of this domain is not known, but it is thought to be involved in riboflavin
biosynthesis. This domain is found in the C terminus of RibD/RibG Swiss:P25539, in
combination with dCMP_cyt_deam, as well as in isolation in some archaebacterial proteins
Swiss:P95872. Number of members: 21

1004. Ribosomal protein L16 signatures (Ribosomal_L16) 

Ribosomal protein L16 is one of the proteins from the large ribosomal subunit. In Escherichia
coli, L16 is known to bind directly the 23S rRNA and to be located at the A site of the
peptidyltransferase center. It belongs to a family of ribosomal proteins which, on the basis of
sequence similarities [1], groups:

- Eubacterial L16.

- Algal and plant chloroplast L16.

- Cyanelle L16.

- Plant mitochondrial L16.

L16 is a protein of 133 to 185 amino-acid residues. As signature patterns, we 
selected two conserved regions in the central section of these proteins. 

Consensus pattern: [KR](2)-x-[GSAC]-[KRQVA]-[LIVM]-W-[LIVM]-[KR]-[LIVM]-
[LFY]-[AP]. Sequences known to belong to this class detected by the pattern: ALL.

Consensus pattern: R-M-G-x-[GR]-K-G-x(4)-[FWKR]. Sequences known to belong to this
class detected by the pattern: ALL.

[1] Otaka E., Hashimoto T., Mizuta K., Suzuki K. Protein Seq. Data Anal. 5:301-313(1993).

1005. Ribosomal protein L32e signature (Ribosomal_L32E)

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis 
of sequence similarities. One of these families consists of: 

- Mammalian L32 [1].

- Drosophila RP49 [2].

- Trichoderma harzianum L32 [3].

- Yeast L32e (YBL092w).

- Archaebacterial L32e [4].

These proteins have 135 to 240 amino-acid residues. As a signature pattern, a stretch of about 
20 residues located in the N-terminal part of these proteins was selected.

Consensus pattern: F-x-R-x(4)-[KR]-x(2)-[KR]-[LIVMF]-x(3,5)-W-R-[KR]-x(2)-G. Sequences
known to belong to this class detected by the pattern: ALL.

[1] Jacks C.M., Powaser C.B., Hackett P.B. Gene 74:565-570(1988).
[2] Aguade M. Mol. Biol. Evol. 5:433-441(1988).

[3] Lora J.M., Garcia I., Benitez T., Llobell A., Pintor-Toro J.A. Nucleic Acids Res.
21:3319-3319(1993).

[4] Arndt E., Scholzen T., Kroemer W., Hatakeyama T., Kimura M. Biochimie
73:657-668(1991).

1006. (Ribosomal_S3) Ribosomal protein S3 signature 

PROSITE: PDOC00474. PROSITE cross-reference(s) PS00548; RIBOSOMAL_S3

Ribosomal protein S3 is one of the proteins from the small ribosomal subunit. 
In Escherichia coli, S3 is known to be involved in the binding of initiator Met-tRNA. It 
belongs to a family of ribosomal proteins which, on the basis of sequence similarities [1], 
groups: 

-Eubacterial S3. 

-Algal and plant chloroplast S3. 

-Cyanelle S3. 

-Archaebacterial S3. 

-Plant mitochondrial S3. 

-Vertebrate S3. 

-Insect S3. 

-Caenorhabditis elegans S3 (C23G10.3). 
-Yeast S3 (Rpl3). 

S3 is a protein of 209 to 559 amino-acid residues. A conserved region located in the C- 
terminal section was selected as a signature pattern. 

Consensus pattern: [GSTA]-[KR]-x(6)-G-x-[LIVMT]-x(2)-[NQSCH]-x(1,3)-[LIVFCA]-x(3)-
[LIV]-[DENQ]-x(7)-[LMT]-x(2)-G-x(2)-[GS]. Sequences known to belong to this class
detected by the pattern: ALL, except for some mitochondrial S3.

[1] Otaka E., Hashimoto T., Mizuta K. Protein Seq. Data Anal. 5:285-300(1993).



1007. RimM - RimM 




The RimM protein is essential for efficient processing of 16S rRNA [1]. The RimM protein 
was shown to have affinity for free ribosomal 30S subunits but not for 30S subunits in the
70S ribosomes [1]. Number of members: 14.

[1] Medline: 98083058. RimM and RbfA are essential for efficient processing of 16S rRNA in
Escherichia coli. Bylund GO, Wipemo LC, Lundberg LA, Wikstrom PM; J Bacteriol 
1998;180:73-82. 

1008. RNA_pol_A - RNA polymerase alpha subunit 

RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes
contain a single RNA polymerase compared to three in eukaryotes (not including
mitochondrial and chloroplast polymerases).

Members of this family include: A subunit from eukaryotes, gamma subunit from
cyanobacteria, beta' subunit from eubacteria, A subunit from archaebacteria, B" from
chloroplasts. Number of members: 139.

[1] Medline: 97066998. Structural modules of the large subunits of RNA polymerase.
Introducing archaebacterial and chloroplast split sites in the beta and beta' subunits of 
Escherichia coli RNA polymerase. Severinov K, Mustaev A, Kukarin A, Muzzin O, Bass I, 
Darst SA, Goldfarb A; J Biol Chem 1996;271:27969-27974. 

1009. RuBisCO_large - Ribulose bisphosphate carboxylase large chain active site
PROSITE: PDOC00142; PROSITE cross-reference(s) PS00157; RUBISCO_LARGE

Ribulose bisphosphate carboxylase (EC 4.1.1.39) (RuBisCO) [1,2] catalyzes the 
initial step in Calvin's reductive pentose phosphate cycle in plants as well as purple and green 
bacteria. It consists of a large catalytic unit and a small subunit of undetermined function. In 
plants, the large subunit is coded by the chloroplastic genome while the small subunit is 
encoded in the nuclear genome. Molecular activation of RuBisCO by CO2 involves the
formation of a carbamate with the epsilon-amino group of a conserved lysine residue. This 
carbamate is stabilized by a magnesium ion. One of the ligands of the magnesium ion is an 
aspartic acid residue close to the active site lysine [3]. A pattern was developed which 
includes both the active site residue and the metal ligand, and which is specific to RuBisCO 
large chains. 

Consensus pattern: G-x-[DN]-F-x-K-x-D-E [K is the active site residue] [The second D is a
magnesium ligand]. Sequences known to belong to this class detected by the pattern: ALL,
except for Cheilopleuria biscuspis RuBisCO.
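
Because this pattern carries residue annotations (the active-site lysine and the magnesium-liganding aspartate), a match can be used to report where those residues fall in a sequence. The sketch below is illustrative only: it hand-translates the pattern into a Python regular expression with named groups, and the function name, group names, and toy sequence are assumptions introduced for the example.

```python
import re

# G-x-[DN]-F-x-K-x-D-E, with the two annotated residues captured by name.
RUBISCO_LARGE = re.compile(r"G.[DN]F.(?P<active_lys>K).(?P<mg_ligand>D)E")


def annotate_rubisco_site(sequence: str):
    """Return 1-based positions of the annotated residues, or None if no match."""
    hit = RUBISCO_LARGE.search(sequence)
    if hit is None:
        return None
    return {
        "active_site_lys": hit.start("active_lys") + 1,
        "mg_ligand_asp": hit.start("mg_ligand") + 1,
    }


# Made-up toy fragment that satisfies the pattern; not taken from the disclosure.
TOY_FRAGMENT = "AAGLDFTKDDENAA"
print(annotate_rubisco_site(TOY_FRAGMENT))
```

Positions are reported 1-based to follow the usual residue-numbering convention for protein sequences.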

[1] Miziorko H.M., Lorimer G.H. Annu. Rev. Biochem. 52:507-535(1983).
[2] Akazawa T., Takabe T., Kobayashi H. Trends Biochem. Sci. 9:380-383(1984).
[3] Andersson I., Knight S., Schneider G., Lindqvist Y., Lundqvist T., Branden C.-I., Lorimer
G.H. Nature 337:229-234(1989).

1010. Rve - Integrase core domain 

Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. 
Integrase is composed of three domains. The amino-terminal domain is a zinc binding
domain Integrase_Zn. This domain is the central catalytic domain. The carboxyl terminal
domain is a non-specific DNA binding domain integrase. The catalytic domain acts as an
endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral 
DNA made by reverse transcription. This domain also catalyses the DNA strand transfer 
reaction of the 3' ends of the viral DNA to the 5' ends of the integration site [1]. Number of 
members: 694. 

[1] Medline: 95099322. Crystal structure of the catalytic domain of HIV-1 integrase:
similarity to other polynucleotidyl transferases. Dyda F, Hickman AB, Jenkins TM, 
Engelman A, Craigie R, Davies DR; Science 1994;266:1981-1986. 

1011. (SBP_bac_3) Bacterial extracellular solute-binding proteins, family 3 signature 
PROSITE: PDOC00798. PROSITE cross-reference(s) PS01039; SBP_BACTERIAL_3 

Bacterial high affinity transport systems are involved in active transport of solutes 
across the cytoplasmic membrane. The protein components of these traffic systems include 
one or two transmembrane protein components, one or two membrane-associated ATP- 
binding proteins (ABC transporters; see <PDOC00185>) and a high affinity periplasmic 
solute-binding protein. The latter are thought to bind the substrate in the vicinity of the inner
membrane, and to transfer it to a complex of inner membrane proteins for concentration into 
the cytoplasm. 

In gram-positive bacteria, which are surrounded by a single membrane and therefore have
no periplasmic region, the equivalent proteins are bound to the membrane via an
N-terminal lipid anchor. These homolog proteins do not play an integral role in the transport
process per se, but probably serve as receptors to trigger or initiate translocation of the solute
through the membrane by binding to external sites of the integral membrane proteins of the
efflux system.

In addition at least some solute-binding proteins function in the initiation of sensory 
transduction pathways. 

On the basis of sequence similarities, the vast majority of these solute-binding 
proteins can be grouped [1] into eight families or clusters, which generally correlate with the
nature of the solute bound. 

Family 3 groups together specific amino acids and opine-binding periplasmic proteins 
and a periplasmic homolog with catalytic activity: 

-Histidine-binding protein (gene hisJ) of Escherichia coli and related bacteria. An
homologous lipoprotein exists in Neisseria gonorrhoeae.

-Lysine/arginine/ornithine-binding proteins (LAO) (gene argT) of Escherichia coli and
related bacteria are involved in the same transport system as hisJ. Both solute-binding
proteins interact with a common membrane-bound receptor hisP of the binding protein
dependent transport system HisQMP.

-Glutamine-binding proteins (gene glnH) of Escherichia coli and Bacillus
stearothermophilus.

-Glutamate-binding protein (gene gluB) of Corynebacterium glutamicum.
-Arginine-binding proteins artI and artJ of Escherichia coli.
-Nopaline-binding protein (gene nocT) from Agrobacterium tumefaciens.
-Octopine-binding protein (gene occT) from Agrobacterium tumefaciens.
-Major cell-binding factor (CBF1) (gene: peb1A) from Campylobacter jejuni.
-Bacteroides nodosus protein aabA.
-Cyclohexadienyl/arogenate dehydratase of Pseudomonas aeruginosa, a periplasmic
enzyme which forms an alternative pathway for phenylalanine biosynthesis.
-Escherichia coli protein fliY.
-Vibrio harveyi protein patH.
-Escherichia coli hypothetical protein ydhW.
-Bacillus subtilis hypothetical protein yckB.
-Bacillus subtilis hypothetical protein yckK.

The signature pattern is located near the N-terminus of the mature proteins. 

Consensus pattern: G-[FYIL]-[DE]-[LIVMT]-[DE]-[LIVMF]-x(3)-[LIVMA]-[VAGC]-x(2)-
[LIVMAGN]. Sequences known to belong to this class detected by the pattern: ALL.

[1] Tam R., Saier M.H. Jr. Microbiol. Rev. 57:320-346(1993).

1012. Sec7 - Sec7 domain

The Sec7 domain is a guanine-nucleotide-exchange-factor (GEF) for the arf family [2].
Number of members: 32.

[1] Medline: 98169075. Structure of the Sec7 domain of the Arf exchange factor ARNO.
Cherfils J, Menetrey J, Mathieu M, Le Bras G, Robineau S, Beraud-Dufour S, Antonny B, 
Chardin P; Nature 1998;392:101-105. 

[2]Medline: 97100951. A human exchange factor for ARF contains Sec7- and pleckstrin- 
homology domains. Chardin P, Paris S, Antonny B, Robineau S, Beraud-Dufour S, Jackson 
CL, Chabre M. Nature 1996;384:481-484. 

1013. SecA_protein. SecA protein, amino terminal region 

SecA protein binds to the plasma membrane where it interacts with proOmpA to support 
translocation of proOmpA through the membrane. SecA protein achieves this translocation, 
in association with SecY protein, in an ATP dependent manner. SecA possesses the ATPase 
activity. The carboxyl terminus has similarity with the helicase carboxyl terminus. See 
Ribosomal_L5. Number of members: 45. 

[1] Medline: 98309858. Amino-terminal region of SecA is involved in the function of SecG
for protein translocation into Escherichia coli membrane vesicles. Mori H, Sugiyama H,
Yamanaka M, Sato K, Tagaya M, Mizushima S; J Biochem (Tokyo) 1998;124:122-129.

[2] Medline: 89251629. SecA protein hydrolyzes ATP and is an essential component of the
protein translocation ATPase of Escherichia coli. Lill R, Cunningham K, Brundage LA, Ito
K, Oliver D, Wickner W; EMBO J 1989;8:961-966.

1014. Seedstore_2S - 2S seed storage family

Members of this family are composed of two chains (both included in the alignment); these
are co-translated and later cleaved. The two chains are disulphide linked together. Number of
members: 27.

[1] Medline: 97121264. 1H NMR assignment and global fold of napin BnIb, a representative
2S albumin seed protein. Rico M, Bruix M, Gonzalez C, Monsalve RI, Rodriguez R; 
Biochemistry 1996;35:15672-15682. 

1015. Smr - Smr domain 

This family includes the Smr (Small MutS Related) proteins, and the C-terminal region of the 
MutS2 protein. It has been suggested that this domain interacts with the MutS1 Swiss:P23909
protein in the case of Smr proteins and with the N-terminal MutS related region of MutS2
Swiss:P94545 [1]. Number of members: 14.

[1] Medline: 10431172. Smr: a bacterial and eukaryotic homologue of the C-terminal region
of the MutS2 family. Moreira D, Philippe H; Trends Biochem Sci 1999;24:298-300. 

1016. (SSF) Sodium:solute symporter family signatures and profile 

PROSITE: PDOC00429. PROSITE cross-reference(s) PS00456; NA_SOLUT_SYMP_1
PS00457; NA_SOLUT_SYMP_2 PS50283; NA_SOLUTE_SYMP_3

It has been shown [1,2] that integral membrane proteins that mediate the intake of a 
wide variety of molecules with the concomitant uptake of sodium ions (sodium symporters) 
can be grouped, on the basis of sequence and functional similarities into a number of distinct 
families. One of these families is known as the sodium:solute symporter family (SSF) and
currently consists of the following proteins:
-Mammalian Na+/glucose co-transporter.
-Mammalian Na+/myo-inositol co-transporter.
-Mammalian Na+/nucleoside co-transporter.
-Mammalian Na+/neutral amino acid co-transporter.
-Escherichia coli Na+/proline symporter (gene putP).
-Escherichia coli Na+/pantothenate symporter (gene panF).
-Escherichia coli hypothetical protein yidK.
-Escherichia coli hypothetical protein yjcG.
-Bacillus subtilis hypothetical protein ywcA (ipa-31R).

These integral membrane proteins are predicted to comprise at least ten membrane 
spanning domains. Two conserved regions were selected as signature patterns; the first one is 
located in the fourth transmembrane region and the second one in a loop between two 
transmembrane regions in the C-terminal part of these proteins. 

Consensus pattern: [GS]-x(2)-[LIY]-x(3)-[LIVMFYWSTAG](10)-[LIY]-[TAV]-x(2)-G-G-
[LMF]-x-[SAP]. Sequences known to belong to this class detected by the pattern: ALL.

Consensus pattern: [GAST]-[LIVM]-x(3)-[KR]-x(4)-G-A-x(2)-[GAS]-[LIVMGS]-[LIVMW]-
[LIVMGAT]-G-x-[LIVMGA]. Sequences known to belong to this class detected by the
pattern: ALL, except for E.coli yidK.

Note this documentation entry is linked to both a signature pattern and a profile. As the 
profile is much more sensitive than the pattern, you should use it if you have access to the 
necessary software tools to do so. 

[1] Reizer J., Reizer A., Saier M.H. Jr. Res. Microbiol. 141:1069-1072(1991).
[2] Reizer J., Reizer A., Saier M.H. Jr. Biochim. Biophys. Acta 1197:133-136(1994).

1017. SurE - Survival protein SurE 

E. coli cells with the surE gene disrupted are found to survive poorly in stationary phase [1]. 
It is suggested that SurE may be involved in stress response. Yeast also contains a member of 
the family Swiss:P38254. Swiss:P30887 can complement a mutation in acid phosphatase, 
suggesting that members of this family could be phosphatases. Number of members: 17. 

[1] Medline: 95014035. A new gene involved in stationary-phase survival located at 59
minutes on the Escherichia coli chromosome. Li C, Ichikawa JK, Ravetto JJ, Kuo HC, Fu JC, 
Clarke S; J Bacteriol 1994;176:6015-6022. 

[2]Medline: 93046805. Complementation of Saccharomyces cerevisiae acid phosphatase 
mutation by a genomic sequence from the yeast Yarrowia lipolytica identifies a new 
phosphatase. Treton BY, Le Dall MT, Gaillardin CM; Curr Genet 1992;22:345-355. 

1018. Synuclein - Synuclein

There are three types of synucleins in humans: alpha, beta and gamma.
Alpha synuclein has been found mutated in families with autosomal dominant Parkinson's 
disease. A peptide of alpha synuclein has also been found in amyloid plaques in Alzheimer's 
patients. Number of members: 12. 

[1] Medline: 98424410. The synuclein family. Lavedan C; Genome Res 1998;8:871-880.

1019. (T-box) T-box domain signatures 

PROSITE: PDOC00972. PROSITE cross-reference(s) PS01283; TBOX_1 PS01264;
TBOX_2

A number of eukaryotic DNA-binding proteins contain a domain of about 170 to 190 
amino acids known as the T-box domain [1,2,3], which probably binds DNA. The T-box
was first found in the mouse T locus (Brachyury) protein, a transcription factor involved
in mesoderm differentiation. It has since been found in the following proteins:
-Vertebrate and invertebrate homologs of the T protein.
-Mammalian proteins TBX1 to TBX6.

-Mammalian protein TBR1 which is expressed specifically in brain.
-Xenopus laevis eomesodermin (eomes). 

-Xenopus laevis Vegt (or Antipodean), a transcription factor that activates the expression of 
wnt-8, eomes and Brachyury. 
-Chicken TbxT. 

-Drosophila protein optomotor-blind (omb). 

-Drosophila protein brachyenteron (byn) (also known as Trg), which is 
required for the specification of the hindgut and anal pads. 
-Drosophila protein H15. 
-Caenorhabditis elegans protein tbx-12. 

-Caenorhabditis elegans hypothetical proteins F21H11.3, F40H6.4, T07C4.2, T07C4.6 and 
ZK177.10. 

Two conserved regions were selected as signature patterns for the T-domain. The first region 
corresponds to the N-terminal of the domain and the second one to the central part. 

Consensus pattern: L-W-x(2)-[FC]-x(3,4)-[NT]-E-M-[LIV](2)-T-x(2)-G-[RG]-[KRQ].
Sequences known to belong to this class detected by the pattern: ALL, except for C.elegans
ZK177.10.

Consensus pattern: [LIVMYW]-H-[PADH]-[DEN]-[GS]-x(3)-G-x(2)-W-M-x(3)-[IVA]-x-F.
Sequences known to belong to this class detected by the pattern: ALL, except for C.elegans
tbx-12, ZK177.10 and Drosophila H15.

[1] Bollag R.J., Siegfried Z., Cebra-Thomas J.A., Garvey N., Davison E.M., Silver L.M. Nat.
Genet. 7:383-389(1994). 

[2] Agulnik S.I., Garvey N., Hancock S., Ruvinsky I., Chapman D.L., Agulnik I., Bollag R.J., 
Papaioannou V.E., Silver L.M. Genetics 144:249-254(1996). 
[3]Papaioannou V.E. Trends Genet. 13:212-213(1997). 

1020. Toprim - Toprim domain 

This is a conserved region from DNA primase. This corresponds to the Toprim domain 
common to DnaG primases, topoisomerases, OLD family nucleases and RecR proteins [1]. 
Both DnaG motifs IV and V are present in the alignment; the DxD (V) motif may be involved
in Mg2+ binding and mutations to the conserved glutamate (IV) completely abolish DnaG
type primase activity [1]. DNA primase EC:2.7.7.6 is a nucleotidyltransferase; it synthesizes
the oligoribonucleotide primers required for DNA replication on the lagging strand of the
replication fork; it can also prime the leading strand and has been implicated in cell division
[2]. Number of members: 133. 

[1] Medline: 98391745. Toprim--a conserved catalytic domain in type IA and II
topoisomerases, DnaG-type primases, OLD family nucleases and RecR proteins. Aravind L, 
Leipe DD, Koonin EV; Nucleic Acids Res 1998;26:4205-4213. 

[2]Medline: 97368180. Cloning and analysis of the dnaG gene encoding Pseudomonas putida 
DNA primase. Szafranski P, Smith CL, Cantor CR; Biochim Biophys Acta
1997;1352:243-248.

[3]Medline: 94124015. The Haemophilus influenzae dnaG sequence and conserved bacterial 
primase motifs. Versalovic J, Lupski JR; Gene 1993;136:281-286. 



1021. TraB - TraB family 

pAD1 is a hemolysin/bacteriocin plasmid originally identified in Enterococcus faecalis DS16.
It encodes a mating response to a peptide sex pheromone, cAD1, secreted by recipient
bacteria. Once the plasmid pAD1 is acquired, production of the pheromone ceases, a trait
related in part to a determinant designated traB. However a related protein is found in C.
elegans Swiss:Q94217, suggesting that members of the TraB family have some more general
function. Number of members: 12.

[1] Medline: 94302142. Characterization of the determinant (traB) encoding sex pheromone
shutdown by the hemolysin/bacteriocin plasmid pAD1 in Enterococcus faecalis. An FY,
Clewell DB; Plasmid 1994;31:215-221.

1022. (Transpo_mutator) Transposases, Mutator family, signature 
PROSITE: PDOC00770. PROSITE cross-reference(s) PS01007;
TRANSPOSASE_MUTATOR

Autonomous mobile genetic elements such as transposons or insertion sequences (IS)
encode an enzyme, called transposase, required for excising and inserting the mobile element.
On the basis of sequence similarities, transposases can be grouped into various families. One
of these families has been shown [1,2,3,E1] to consist of transposases from the following
elements:

-Mutator from Maize.
-Is1201 from Lactobacillus helveticus.
-Is905 from Lactococcus lactis.
-Is1081 from Mycobacterium bovis.
-Is6120 from Mycobacterium smegmatis.
-Is406 from Pseudomonas cepacia.
-IsRm3 from Rhizobium meliloti.
-IsRm5 from Rhizobium meliloti.
-Is256 from Staphylococcus aureus.
-IsT2 from Thiobacillus ferrooxidans.

The maize Mutator transposase (MudrA) is a protein of 823 amino acids; the bacterial
transposases listed above are proteins of 300 to 420 amino acids. These proteins contain a
conserved domain of about 130 residues; a signature pattern was derived from the most
conserved part of this domain.

Consensus pattern: D-x(3)-G-[LIVMF]-x(6)-[STAV]-[LIVMFYW]-[PT]-x-[STAV]-x(2)-
[QR]-x-C-x(2)-H. Sequences known to belong to this class detected by the pattern: ALL.

[1] Eisen J.A., Benito M.-I., Walbot V. Nucleic Acids Res. 22:2634-2636(1994).
[2] Guilhot C., Gicquel B., Davies J., Martin C. Mol. Microbiol. 6:107-113(1992).
[3] Wood M.S., Byrne A., Lessie T.G. Gene 105:101-105(1991).

1023. Transposase_8 - Transposase 

Transposase proteins are necessary for efficient DNA transposition. This family
consists of various E. coli insertion elements and other bacterial transposases some of which
are members of the IS3 family. Number of members: 58.

[1] Medline: 97324595. Genetic organization and transposition properties of IS511. D. A.
Mullin, D. L. Zies, A. H. Mullin, N. Caballera & B. Ely; Mol Gen Genet 1997;254:456-463.

[2] Medline: 97128810. The use of an improved transposon mutagenesis system for DNA
sequencing leads to the characterization of a new insertion sequence of Streptomyces lividans
66. J. Fischer, H. Maier, P. Viell & J. Altenbuchner; Gene 1996;180:81-89.

[3] Medline: 97074647. Identification and nucleotide sequence of Rhizobium meliloti
insertion sequence ISRm6, a small transposable element that belongs to the IS3 family. S.
Zekri & N. Toro; Gene 1996;175:43-48.

1024. tRNA_int_endo - tRNA intron endonuclease 

Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron
EC:3.1.27.9. Number of members: 8.

[1] Medline: 97344075. Properties of H. volcanii tRNA intron endonuclease reveal a
relationship between the archaeal and eucaryal tRNA intron processing systems.
Kleman-Leyer K, Armbruster DW, Daniels CJ; Cell 1997;89:839-847.

1025. Urease - Urease signatures 

PROSITE: PDOC00133. PROSITE cross-reference(s) PS01120; UREASE_1 PS00145;
UREASE_2

Urease (EC 3.5.1.5) is a nickel-binding enzyme that catalyzes the hydrolysis of urea
to carbon dioxide and ammonia [1]. Historically, it was the first enzyme to be crystallized (in
1926). It is mainly found in plant seeds, microorganisms and invertebrates. In plants, urease
is a hexamer of identical chains. In bacteria [2], it consists of either two or three different
subunits (alpha, beta and gamma).

Urease binds two nickel ions per subunit; four histidines, an aspartate and a
carbamated lysine serve as ligands to these metals; an additional histidine is involved in the
catalytic mechanism [3].

As signatures for this enzyme, a region that contains two histidines that bind one of the
nickel ions and the region of the active site histidine were selected.

Consensus pattern: T-[AY]-[GA]-[GAT]-[LIVM]-D-x-H-[LIVM]-H-x(3)-P [The two H's bind
nickel]. Sequences known to belong to this class detected by the pattern: ALL.

Consensus pattern: [LIVM](2)-[CT]-H-[HN]-L-x(3)-[LIVM]-x(2)-D-[LIVM]-x-F-A [H is the
active site residue]. Sequences known to belong to this class detected by the pattern: ALL.

[1] Takishima K., Suga T., Mamiya G. Eur. J. Biochem. 175:151-165(1988).

[2] Mobley H.L.T., Hausinger R.P. Microbiol. Rev. 53:85-108(1989).

[3] Jabri E., Carr M.B., Hausinger R.P., Karplus P.A. Science 268:998-1004(1995).

1026. Urease_beta - Urease beta subunit. 

This subunit is known as alpha in Helicobacter. Number of members: 35.

[1] Medline: 95273988. The crystal structure of urease from Klebsiella aerogenes. Jabri E,
Carr MB, Hausinger RP, Karplus PA; Science 1995;268:998-1004.

1027. UvrD-helicase - UvrD/REP helicase 

The Rep family helicases are composed of four structural domains and function
as dimers. REP helicases catalyse ATP dependent unwinding of double stranded DNA to
single stranded DNA. Swiss:P23478, Swiss:P08394 have large insertions near to the
carboxy-terminus relative to other members of the family. Number of members: 52.

[1] Medline: 97433075. Major domain swiveling revealed by the crystal structures of
complexes of E. coli Rep helicase bound to single-stranded DNA and ADP. Korolev S, Hsieh
J, Gauss GH, Lohman TM, Waksman G; Cell 1997;90:635-647.

1028. V-type ATPase 116kDa subunit family (V_ATPase_sub_a)

This family consists of the 116kDa V-type ATPase (vacuolar (H+)-ATPases) subunits, as
well as V-type ATP synthase subunit i. The V-type ATPases family are proton pumps that
acidify intracellular compartments in eukaryotic cells, for example yeast central vacuoles,
clathrin-coated and synaptic vesicles. They have important roles in membrane trafficking
processes [1]. The 116kDa subunit (subunit a) in the V-type ATPase is part of the V0
functional domain responsible for proton transport. The a subunit is a transmembrane
glycoprotein with multiple putative transmembrane helices; it has a hydrophilic amino
terminal and a hydrophobic carboxy terminal [1,2]. It has roles in proton transport and
assembly of the V-type ATPase complex [1,2]. This subunit is encoded by two homologous
genes in yeast, VPH1 and STV1 [2].
Number of members: 27

[1] Forgac M; Medline: 99240666 "Structure and properties of the vacuolar (H+)-ATPases."
J Biol Chem 1999;274:12951-12954.

[2] Forgac M; Medline: 99270697 "Structure and properties of the clathrin-coated vesicle and 
yeast vacuolar V-ATPases." J Bioenerg Biomembr 1999;31:57-65. 

1029. Viral (Superfamily 1) RNA helicase (Viral_helicase1)
Number of members: 260

[1] Koonin EV, Dolja VV; Medline: 94094568 "Evolution and taxonomy of positive-strand
RNA viruses: implications of comparative analysis of amino acid sequences." Crit Rev
Biochem Mol Biol 1993;28:375-430.

1030. Vesicular monoamine transporter (VMAT) 




This family consists of various vesicular amine transporters with 12 transmembrane helices.
These include vesicular acetylcholine transporters (VAChT) [3], and vesicular monoamine
transporters (VMATs) [1,2] isoforms 1 adrenal and 2 brain (VMAT1 and VMAT2).

These proteins transport biogenic amines into synaptic vesicles or chromaffin granules [4].
VMATs pack monoamine neurotransmitters into secretory vesicles for regulated exocytotic
release; they also protect against the parkinsonian neurotoxin MPP+ by transporting it into
vesicles, preventing it from acting on mitochondria [1].

Also in the family is C. elegans UNC-17, a putative vesicular acetylcholine transporter;
mutations in UNC-17 cause impaired neuromuscular function, giving rise to jerky or
uncoordinated movement [4].
Number of members: 15

[1] Krantz DE, Peter D, Liu Y, Edwards RH; Medline: 97197857 "Phosphorylation of a
vesicular monoamine transporter by casein kinase II." J Biol Chem 1997;272:6752-6759.

[2] Erickson JD, Varoqui H, Schafer MK, Modi W, Diebler MF, Weihe E, Rand J, Eiden LE,
Bonner TI, Usdin TB; Medline: 94350930 "Functional identification of a vesicular
acetylcholine transporter and its expression from a 'cholinergic' gene locus." J Biol Chem
1994;269:21929-21932.

[3] Erickson JD, Schafer MK, Bonner TI, Eiden LE, Weihe E; Medline: 96209876 "Distinct
pharmacological properties and distribution in neurons and endocrine cells of two isoforms of
the human vesicular monoamine transporter." Proc Natl Acad Sci U S A 1996;93:5166-5171.

[4] Alfonso A, Grundahl K, Duerr JS, Han HP, Rand JB; Medline: 3342494 "The
Caenorhabditis elegans unc-17 gene: a putative vesicular acetylcholine transporter." Science
1993;261:617-619.

1031. WW/rsp5AVWP domain signature and profile. Cross-reference(s): PS01159; 
WW_DOMAIN_l; PS50020; WW_DOMAIN_2 

30 

The WW domain [1-4,E1] (also known as rsp5 or WWP) was originally discovered as a
short conserved region in a number of unrelated proteins, among them dystrophin, the product
of the gene responsible for Duchenne muscular dystrophy. The domain, which spans about 35
residues, is repeated up to 4 times in some proteins. It has been shown [5] to bind proteins with
particular proline motifs, [AP]-P-P-[AP]-Y, and thus somewhat resembles SH3 domains. It
appears to contain beta-strands grouped around four conserved aromatic positions, generally
Trp. The name WW or WWP derives from the presence of these Trp residues as well as that of a
conserved Pro. It is frequently associated with other domains typical of proteins in signal
transduction processes.
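
The ligand motif quoted above, [AP]-P-P-[AP]-Y, can be written as a regular expression over one-letter residue codes. The following is only a minimal sketch in Python: the function name and the test peptide are hypothetical and do not come from the specification.

```python
import re
from typing import List, Tuple

# [AP]-P-P-[AP]-Y as a regular expression. The lookahead lets overlapping
# occurrences be reported as well.
WW_LIGAND_MOTIF = re.compile(r"(?=([AP]PP[AP]Y))")


def find_ww_ligand_motifs(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched residues) for each motif hit."""
    hits = []
    for match in WW_LIGAND_MOTIF.finditer(sequence.upper()):
        hits.append((match.start() + 1, match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the function.
    print(find_ww_ligand_motifs("GTPPPPYTVG"))
```

A hit only shows that the sequence motif is present. Whether a given protein actually binds a WW domain has to be established experimentally.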

Proteins containing the WW domain are listed below. 

- Dystrophin, a multidomain cytoskeletal protein. Its longest alternatively spliced form
consists of an N-terminal actin-binding domain, followed by 24 spectrin-like repeats, a
cysteine-rich calcium-binding domain and a C-terminal globular domain. Dystrophin forms
tetramers and is thought to have multiple functions including involvement in membrane
stability, transduction of contractile forces to the extracellular environment and organization
of membrane specialization. Mutations in the dystrophin gene lead to muscular dystrophy of
Duchenne or Becker type. Dystrophin contains one WW domain C-terminal to the spectrin
repeats.

- Utrophin, a dystrophin-like protein of unknown function.

- Vertebrate YAP protein is a substrate of an unknown serine kinase. It binds to the SH3
domain of the Yes oncoprotein via a proline-rich region. This protein appears in alternatively
spliced isoforms, containing either one or two WW domains [6].

- Mouse NEDD-4 plays a role in the embryonic development and differentiation of the
central nervous system. It contains 3 WW modules followed by a HECT domain. The
human ortholog contains 4 WW domains, but the third WW domain is probably spliced
out, resulting in an alternate NEDD-4 protein with only 3 WW modules [3].

- Yeast RSP5 is similar to NEDD-4 in its molecular organization. It contains an N-terminal
C2 domain (see <PDOC00380>), followed by a histidine-rich region, 3 WW domains and a
HECT domain.

- Rat FE65, a transcription-factor activator expressed preferentially in liver. The activator
domain is located within the N-terminal 232 residues of FE65, which also contain the WW
domain.

- Yeast ESS1/PTF1, a putative peptidyl prolyl cis-trans isomerase from family ppiC (see
<PDOC00840>). A related protein, dodo (gene dod) exists in Drosophila and in mammals
(gene PIN1).

- Tobacco DB10 protein. The WW domain is located N-terminal to the region with
similarity to ATP-dependent RNA helicases.

- IQGAP, a human GTPase activating protein acting on ras. It contains an N-terminal
domain similar to fly muscle mp20 protein and a C-terminal ras GTPase activator domain.

- Yeast pre-mRNA processing protein PRP40, Caenorhabditis elegans ZK1098.1 and fission
yeast SpAC13C5.02 are related proteins with similarity to MYO2-type myosin, each
containing two WW domains at the N-terminus.

- Caenorhabditis elegans hypothetical protein C38D4.5, which contains one WW module, a
PH domain (see <PDOC50003>) and a C-terminal phosphatidylinositol 3-kinase domain.

- Yeast hypothetical protein YFL010c.

For the sensitive detection of WW domains, a profile was developed which spans the whole
homology region as well as a pattern.

Description of pattern(s) and/or profile(s):

Consensus pattern: W-x(9,11)-[VFY]-[FYW]-x(6,7)-[GSTNE]-[GSTQCR]-[FYW]-x(2)-P.
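
The consensus pattern above is written in PROSITE syntax, where `x` is any residue, `[...]` lists the residues allowed at a position, `{...}` lists the residues excluded, and `(n)` or `(n,m)` gives a repeat count. A minimal sketch of a converter to a Python regular expression follows. The helper name `prosite_to_regex` is hypothetical, the pattern string assumes the first repeat range reads `x(9,11)`, and PROSITE's `<` and `>` anchors are not handled.

```python
import re

# One PROSITE element: x, a single residue, an allowed set [..] or an
# excluded set {..}, optionally followed by a repeat count (n) or (n,m).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: " + repr(element))
        core, low, high = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # excluded residues
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(core + repeat)
    return "".join(parts)


# Assumed reading of the WW consensus pattern.
WW_PATTERN = "W-x(9,11)-[VFY]-[FYW]-x(6,7)-[GSTNE]-[GSTQCR]-[FYW]-x(2)-P."

if __name__ == "__main__":
    print(prosite_to_regex(WW_PATTERN))
```

The resulting expression can be passed to `re.search` to scan a protein sequence. As the text notes, a profile is more sensitive than the pattern, so a missing pattern hit does not rule out a WW domain.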

[1] Bork P., Sudol M. Trends Biochem. Sci. 19:531-533(1994).
[2] Andre B., Springael J.Y. Biochem. Biophys. Res. Commun. 205:1201-1205(1994).
[3] Hofmann K.O., Bucher P. FEBS Lett. 358:153-157(1995).
[4] Sudol M., Chen H.I., Bougeret C., Einbond A., Bork P. FEBS Lett. 369:67-71(1995).
[5] Chen H.I., Sudol M. Proc. Natl. Acad. Sci. U.S.A. 92:7819-7823(1995).
[6] Sudol M., Bork P., Einbond A., Kastury K., Druck T., Negrini M., Huebner
K., Lehman D. J. Biol. Chem. 270:14733-14741(1995).

1032. XPA protein signatures. Cross-reference(s): PS00752; XPA_1; PS00753; XPA_2

Xeroderma pigmentosum (XP) [1] is a human autosomal recessive disease
characterized by a high incidence of sunlight-induced skin cancer. The skin
cells of people with this condition are hypersensitive to ultraviolet light, due
to defects in the incision step of DNA excision repair. There are a minimum of
seven genetic complementation groups involved in this pathway: XP-A to XP-G.
XP-A is the most severe form of the disease and is due to defects in a 30 kD
nuclear protein called XPA (or XPAC) [2].

The sequence of the XPA protein is conserved from higher eukaryotes [3] to 
yeast (gene RAD14) [4]. XPA is a hydrophilic protein of 247 to 296 amino-acid 
residues which has a C4-type zinc finger motif in its central section. 

Two signature patterns were developed for XPA proteins. The first corresponds to the
zinc finger region, the second to a highly conserved region located some 12 residues after the
zinc finger region.

Consensus pattern: C-x-[DE]-C-x(3)-[LIVMF]-x(1,2)-D-x(2)-L-x(3)-F-x(4)-C-x(2)-C
Consensus pattern: [LIVM](2)-T-[KR]-T-E-x-K-x-[DE]-Y-[LIVMF](2)-x-D-x-[DE]

[1] Tanaka K., Wood R.D. Trends Biochem. Sci. 19:83-86(1994).
[2] Miura N., Miyamoto I., Asahina H., Satokata I., Tanaka K., Okada Y. J. Biol. Chem.
266:19786-19789(1991).
[3] Shimamoto T., Kohno K., Tanaka K., Okada Y. Biochem. Biophys. Res. Commun.
181:1231-1237(1991).
[4] Bankmann M., Prakash L., Prakash S. Nature 355:555-558(1992).

1033. YCF9

This family consists of the hypothetical protein product of the YCF9 gene from 
chloroplasts and cyanobacteria. Number of members: 16 

1034. (DUF15)

This family is highly conserved between eubacteria and eukaryotes.

Number of members: 30

1035. Lumenal portion of Cytochrome b559, alpha (gene psbE) subunit. (cytochr_b559a)

This family is the lumenal portion of the cytochrome b559 alpha chain; matches to this family
should be accompanied by a match to the cytochr_b559 family also. The Prosite
pattern matches the transmembrane region of the cytochrome b559 alpha and beta subunits.
Number of members: 16

A. Asparaginase 2 

Asparaginase II (L-asparagine aminohydrolase II) is an extracellular protein that may be 
associated with the cell wall and whose expression is affected by the availability of nitrogen. 
Asparaginase II catalyzes the reaction L-asparagine + H2O = L-aspartate + NH3. As
many leukemias have high requirements for aspartic acid, asparaginase II proteins are useful
as reagents for screening compounds for activity as leukemia chemotherapy products.
Asparaginase II protein can also be over- or under-expressed to alter amino acid content in
plant tissues or to modify nitrogen fixation and/or nitrogen metabolism in plants.

Ref: Bon et al. (1997) Appl Biochem Biotechnol 63-65: 203-12 

B. Chloroa b-bind 

Chlorophyll a-b binding proteins are located in the thylakoid membranes of the chloroplast
and bind chlorophyll a and chlorophyll b, thereby triggering a chemical reaction
(photosynthesis). These proteins are useful in controlling the rate, efficiency and/or output of
photosynthesis. Overexpression of chlorophyll a-b binding proteins is expected to increase
the rate of photosynthesis.

Ref: Leutwiler et al. (1986) Nucleic Acids Res 14: 4051-64 
Brandt et al. (1992) Plant Mol Biol 19: 699-703

C. DMRL synthase

DMRL synthase (6,7-dimethyl-8-ribityllumazine synthase) catalyzes the last step in
riboflavin (vitamin B2) synthesis, condensing
5-amino-6-(1'-D)-ribityl-amino-2,4(1H,3H)-pyrimidinedione with
L-3,4-dihydroxy-2-butanone 4-phosphate to produce 6,7-dimethyl-8-(1-D-ribityl)lumazine.
The enzyme forms a homopentamer. Engineering of these
proteins or those with homologous sequences/structures may allow control of the amounts of
vitamin B2 available in plants and/or accumulation of pigment, as well as altering reactions
requiring hydrogen ion carriers/transmitters.

Ref: Garcia-Ramirez et al. (1995) J Biol Chem 270: 23801-7 

D. E1 N

These proteins are ATP-dependent DNA helicases that are required for initiation of viral
DNA replication. They form a complex with the viral E2 protein. The E1-E2 complex binds
to the replication origin that contains binding sites for both proteins. The majority of
sequences known for this group of proteins are from various papillomaviruses, a type of
double stranded DNA virus. In plants, the prototype double stranded DNA virus is
Cauliflower Mosaic virus (CaMV). Manipulation of these proteins, especially to produce
variant proteins that form non-productive complexes, enables production of plants that are
resistant to infection by double stranded DNA viruses.

Ref: Yang et al. (1993) PNAS USA 90: 5086-90
Ustav and Stenlund (1991) EMBO J 10: 449-57
Callaway et al. (1996) Mol Plant Microbe Interact 9: 810-8

E. EF1 G

Elongation Factor-1 is composed of four subunits: alpha, beta, delta and gamma. Gamma
subunits are presumed to play a role in anchoring the complex to other cellular components.
Studies of EF-1 genes in plants suggest that different forms of the EF-1 subunits may be
expressed in particular organs or in response to stress. Manipulation of the activity of these
proteins, either by altered expression level or by structural mutation, may result in the
accumulation of a particular protein in a chosen organ or allow production of particular
proteins during stress conditions.

Ref: Kinzy et al. (1994) NAR 22: 2703-7
Dunn et al. (1993) Plant Mol Biol 23: 221-5
Aguilar et al. (1991) Plant Mol Biol 17: 351-60



F. ENV_polyprotein

This family comprises the envelope or coat proteins known from a number of different
retroviruses. In mammalian species, retroviruses are responsible for diseases such as
leukemia and AIDS. In plants, retroviruses are known in both monocot (e.g. Zeon-1) and dicot
(e.g. Arabidopsis and tobacco) species and have been shown to induce mutant alleles at new
loci. Engineering of plant ENV proteins may allow mobilization or targeting of endogenous
or introduced retroviruses, in essence generating a new method for mutant production, gene
tagging and the like.



Ref: Mamoun et al. (1990) J Virol 64: 4180-8
Grandbastien et al. (1989) Nature 337: 376-80
Wright and Voytas (1998) Genetics 149: 703-15



G. Glycosyl hydr9

Proteins having this domain (previously known as the glycosyl hydrolase family 5 domain)
catalyze the endohydrolysis of 1,4-β-D-glucosidic linkages in cellulose. Numerous plant
proteins with this domain exist and are expressed in an organ specific manner. They are
involved in the fruit ripening process, in cell elongation and plant reproduction. Modulation
of the activity of these proteins, either by over- or under-expression or by mutation of the
polypeptide, could be used to affect post-harvest physiology (e.g. rate of ripening) or for
engineering reproductive sterility.

Ref: Giorda et al. (1990) Biochemistry 29: 7264-9
Tucker et al. (1988) Plant Physiol 88: 1257-62
Shani et al. (1997) 43: 837-42
Milligan and Gasser (1995) Plant Mol Biol 28: 691-711

H. Glycosyl hydr14



The β-amylases (family 14 of glycosyl hydrolases) catalyze the hydrolysis of
1,4-α-glucosidic linkages in polysaccharides and remove successive maltose units from the
non-reducing ends of the chains. Mutants of β-amylase in Arabidopsis exhibited altered
degradation of starch throughout the diurnal cycle. In addition, the mutant phenotypes
indicated that these enzymes not only affect carbohydrate metabolism/catabolism, but also
influence the amount of pigment stored within particular cells. Manipulation of the β-amylase
genes enables control of plant pigmentation (for example, fibre pigment in cotton) as well as
carbohydrate synthesis and degradation.



Ref: Zeeman et al. (1998) Plant J 15: 357-65
Hirano and Nakamura (1997) Plant Physiol 114: 5675-82
Kitamoto et al. (1988) J Bacteriol 170: 5848-54



I. Glycosyl hydr15

Glycosyl hydrolases from family 15 (such as 1,4-alpha-D-glucan glucohydrolase) catalyze
the hydrolysis of terminal 1,4-linked alpha-D-glucose residues successively from the
non-reducing ends of the chains, resulting in the release of β-D-glucose. In plants these proteins
have been tied to the mobilization of the xyloglucan stored in the cotyledonary cell walls.
Proteins such as these could be varied to affect the rate of plant growth (for example during
germination), storage and/or use of glucose and other sugars by plant tissues and alteration of
the properties, such as elasticity, of plant cell walls.



Ref: Crombie et al. (1998) Plant J 15: 27-38 

Hata et al. (1991) Agric Biol Chem 55: 941-9

J. Glycosyl hydr20

Members of the family 20 glycosyl hydrolases catalyze the hydrolysis of terminal
non-reducing N-acetyl-D-hexosamine residues in N-acetyl-β-D-hexosaminides.
N-acetyl-β-glucosaminidase belongs to this family and exists in several different forms (consisting of
various combinations of alpha and beta chains) depending on the organism. Family 20
glycosyl hydrolases have been implicated in lysosomal storage diseases (such as Sandhoff
disease) and glycogen storage disease in humans. These types of proteins are also
responsible for the hydrolysis of chitin. In plants, these proteins could be useful in
controlling carbohydrate catabolism, thereby influencing the amount of sugars available for
storage and/or use in other metabolic pathways. In addition, it is possible that such proteins
could be used to engineer an endogenous insect protection mechanism, e.g. by secretion of a
chitin-hydrolyzing composition by the plant.

Ref: Graham et al (1988) J Biol Chem 263: 16823-9 
O'Dowd et al. (1988) Biochemistry 27: 5216-26 



K. HMG box

The HMG box is a novel type of DNA-binding domain found in a diverse group of proteins.
Numerous plant proteins contain this domain, such as the HMG1/2-like proteins. The
expression of some of these HMG proteins appears to be regulated by circadian rhythms and
in a light dependent manner, occurring, for example, at higher levels in roots and at lower levels
in light-grown tissues such as cotyledons. Generally, HMG proteins are thought to influence
transcription regulation. In plants, HMGs are believed to have a role in maintaining patterns
of circadian-regulated expression for other genes, suggesting that these proteins could be
exploited to control growth and development.



Ref: Laudet et al. (1993) Nucleic Acids Res 21: 2493-501
Zheng et al. (1993) Plant Mol Biol 23: 813-23
Grasser et al. (1993) Plant Mol Biol 23: 619-25

L. IL2

Interleukin-2 (IL-2) is produced in mammals by T cells in response to antigenic or mitogenic
stimulation and is crucial for proper regulation and functioning of the immune response. IL-2 
is capable of stimulating B cells, monocytes, lymphokine-activated killer cells, natural killer 
cells and glioma cells. Plant extracts have also been shown to stimulate the immune system 
(for example, mistletoe therapy for human cancer). It is known that IL-2 is involved in 
feedback inhibition pathways that impact the inflammatory response as well as the growth 
inhibition of tumor reactive T cells. Plant proteins containing IL-2-like sequences are useful 
as immunity-based therapeutics, acting in a manner similar to IL-2 in mammals. 

Ref: Heike et al. (1997) Scand J Immunol 45: 221-6 
Ariel et al. (1998) J Immunol 161: 2465-72 
Schink (1997) Anticancer Drugs 8 Suppl 1: S47-51 

M. Oxidored FMN 

NADPH dehydrogenases catalyze the reaction NADPH + acceptor = NADP(+) + reduced
acceptor. One member of this family is yeast "old yellow enzyme" (OYE) and is thought to
be involved in oxylipin metabolism. A second yeast family member is an estrogen binding
protein (EBP) that also exhibits oxidoreductase activity. An
Arabidopsis homolog to OYE has been described and estrogen binding proteins in plants 
have been reported. Plant proteins from this class have the potential to be used to modify 
lipid metabolism/catabolism. These proteins may also have use as therapeutics for breast and 
prostate cancer, and other abnormal growth in steroid-sensitive tissues. 

Ref: Baker et al. (1998) Proc Soc Exp Biol Med 217: 317-21 
Schaller and Weiler (1997) J Biol Chem 272: 28066-72 
Mandani et al. (1994) PNAS USA 91: 922-6 



N. Oxidored_q2

The NADH-plastoquinone oxidoreductases catalyze the reaction NADH + plastoquinone =
NAD(+) + plastoquinol. In plants these reactions occur in the chloroplast and are believed to
participate in a chloroplast respiratory system. Here, the NDH complex is postulated to act as
a valve to remove excess reduction equivalents in the chloroplasts. Manipulation of these
proteins may improve the rate or efficiency of photosynthesis.



Ref: Burrows et al. (1998) EMBO J 17: 868-76 

Kofer et al (1998) Mol Gen Genet 258: 166-73 
Maier et al. (1995) J Mol Biol 251: 614-28 

O. PABP 



Polyadenylate binding proteins bind the poly(A) tail of mRNA. Plants, as exemplified by
Arabidopsis, contain numerous PABP genes that are expressed in an organ-specific manner.
For example, PABP2 is functional in roots and shoots, while PABP5 is expressed
predominantly in immature flowers. The PABP proteins are implicated in numerous aspects
of posttranscriptional regulation including mRNA turnover and translational initiation.
Control of activity of PABP proteins provides the ability to control the expression of various
genes in particular organs during development.

Ref: Hilson et al (1993) Plant Physiol 103: 525-33 

Belostotsky and Meagher (1993) PNAS USA 90: 6686-90 



P. Parvo coat 

Parvoviruses are linear single-stranded DNA viruses that are encapsulated by three capsid
proteins. Plants are susceptible to infection by single stranded DNA viruses such as Maize
streak virus (MSV) and various geminiviruses. The coat proteins in these plant viruses are
critical to the virus life cycle within the plant. For example, the coat protein of MSV is
thought to be involved in intra- and inter-cellular movement within the plant. Engineering of
proteins having similarity to parvoviral coat proteins, especially to produce proteins that
interfere with maturation of the virus particle, enables the production of plants having better
resistance to natural plant single-stranded DNA viruses.

Ref: Liu et al. (1997) J Gen Virol 78: 1265-70
Rohde et al. (1990) Virology 176: 648-51

Q. Pkinase C

Plant serine/threonine protein kinases possessing this domain are expressed in all tissues and
are known to undergo serine-specific autophosphorylation and specifically phosphorylate two
ribosomal proteins, P14 and P16. During development, these proteins predominate during
high metabolic activity in growing buds, root tips, leaf margins and germinating seeds. They
are thought to be involved in the control of plant growth and development. In addition, two
genes encoding proteins from this family have been described that help plant cells adapt
during cold or high salt stresses. Consequently, engineering Pkinase C proteins provides a
way to control general growth/development of the plant as well as a means to provide
endogenous protection against environmental stresses.

Ref: Zhang et al. (1994) J Biol Chem 269: 17586-92
Mizoguchi et al. (1995) FEBS Lett 358: 199-204

R. REV

The REV proteins act post-transcriptionally to relieve repression of GAG and ENV
production in retroviruses such as Human Immunodeficiency Virus type I (HIV-1). Plants
contain retrovirus-like viruses such as pararetroviruses and retrotransposons (i.e. transposons
having long terminal repeats). Plant retrotransposons in particular have been used to create
mutations at various loci, thereby permitting gene isolation, gene tagging and the like.
Manipulation of plant REV proteins enables control of transposition frequencies of
corresponding transposable elements and provides a new tool for genetic engineering of
plants.

Ref: Sodroski et al. (1986) Nature 321: 412-7 

Franchini et al. (1989) PNAS USA 86: 2433-7 
Marquet et al. (1995) 77: 113-24

U. Signal

Many plant proteins in this family contain sequences similar to those found in both 
components of the prokaryotic family of signal transducers known as the two-component 
systems. This suggests that activation may require a transfer of a phosphate group between 
the transmitter domain and the receiver domain. One family member in Arabidopsis appears 
to be involved in ethylene (a plant hormone) signal transduction. Other proteins in this family 
appear to be involved in the regulation of gene transcription under conditions of 
environmental stress. Signal proteins can be exploited to affect plant growth and development 
and/or control plant responses to stress conditions such as cold, nutrient availability, etc. 

Ref: Chang et al. (1993) Science 262: 539-44 
Nagaya et al. (1993) Gene 131: 119-124 
Gottfert et al. (1990) PNAS USA 87: 2680-4 

V. vMSA 

vMSA proteins are major surface antigens present on the envelope of various
retroviruses. Surface antigens of retroviruses are often involved in tropism of the virus.
Plants contain retrovirus-like viruses such as pararetroviruses and retrotransposons (i.e.
transposons having long terminal repeats). Plant retrotransposons in particular have been
used to create mutants at various loci, thereby permitting gene isolation, gene tagging and the
like. Manipulation of plant vMSA proteins enables control of tropism of plant retroviruses
that might be used as genetic engineering tools, thus enabling targeting of the virus to
particular species and/or tissues of plants.



Ref: Okamoto et al. (1988) J Gen Virol 69: 2575-83 
Grandbastien et al. (1989) Nature 337: 376-80 
Wright and Voytas (1998) Genetics 149: 703-15 



W. zf-CCCH

This family of proteins is defined by having two CX(8)CX(5)CX(3)H-type zinc finger
domains. These proteins cover a broad range of functions. For example, the COP1 protein
acts as a repressor of photomorphogenesis in darkness; light stimuli abolish this suppressive
action. In addition, COP1 protein can function as a negative transcriptional regulator capable
of direct interaction with components of the G-protein signaling pathway. As a second
example, a zf-CCCH protein identified in Arabidopsis appears to be involved in the
resistance to DNA damage induced by UV light and chemical DNA-damaging agents.
Overexpression of this class of proteins permits production of plants that are better suited to
adverse environments. Manipulation of expression of zf-CCCH proteins functioning as
transcriptional regulators, such as COP1, enables manipulation of some signal transduction
pathways.
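
The CX(8)CX(5)CX(3)H spacing that defines this family can be checked with a regular expression. The sketch below is illustrative only: it assumes the spacing is exactly as written, whereas real family members may deviate, and the function names and test sequence are hypothetical.

```python
import re

# C-x(8)-C-x(5)-C-x(3)-H with the exact spacing given in the text (assumption).
CCCH_FINGER = re.compile(r"C.{8}C.{5}C.{3}H")


def count_ccch_fingers(sequence: str) -> int:
    """Count non-overlapping CX(8)CX(5)CX(3)H motifs in a protein sequence."""
    return len(CCCH_FINGER.findall(sequence.upper()))


def is_zf_ccch_candidate(sequence: str) -> bool:
    """The family is described as having two such fingers, so require at least two."""
    return count_ccch_fingers(sequence) >= 2


if __name__ == "__main__":
    # Synthetic finger built from filler residues, for demonstration only.
    finger = "C" + "A" * 8 + "C" + "A" * 5 + "C" + "A" * 3 + "H"
    print(is_zf_ccch_candidate("MK" + finger + "GG" + finger))
```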



Ref: Pang et al. (1993) Nucleic Acids Res 21: 1647-53 
Deng et al. (1992) Cell 71: 791-801 



Proteins falling within this category contain many X-X-F-G and X-F-X-F-G repeats, and may
contain RANBP1-like or PPIase domains. Plant proteins having domains similar to these
include PAS1 and GMSTI. PAS1 has been shown to have dramatic developmental effects
that appear to be correlated with both cell division and cell wall elongation. GMSTI has high
identity to the yeast STI stress-inducible gene and has been shown to be heat inducible.
Proteins such as these may be useful for controlling growth and form of development.
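
The X-X-F-G and X-F-X-F-G repeats mentioned above can be counted with a short scan. The following Python sketch is an assumption-laden illustration: X is taken to mean any residue, overlapping occurrences are counted, and the function name and test sequence are hypothetical. Every X-F-X-F-G hit necessarily contains an X-X-F-G hit, so the two counts are not independent.

```python
import re
from typing import Dict

# Lookaheads allow overlapping repeats to be counted.
XXFG = re.compile(r"(?=..FG)")
XFXFG = re.compile(r"(?=.F.FG)")


def count_fg_repeats(sequence: str) -> Dict[str, int]:
    """Count X-X-F-G and X-F-X-F-G repeats, with X read as any residue."""
    seq = sequence.upper()
    return {
        "XXFG": sum(1 for _ in XXFG.finditer(seq)),
        "XFXFG": sum(1 for _ in XFXFG.finditer(seq)),
    }


if __name__ == "__main__":
    # Synthetic sequence, for demonstration only.
    print(count_fg_repeats("AAFGSSFAFGTT"))
```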



Ref: Vittorioso et al. (1998) Mol Cell Biol 18: 3034-43
Hernandez Torres et al. (1995) 27: 1221-6



Y. Peptidase M48. 



Proteins belonging to this peptidase family are metalloproteases that bind zinc as a cofactor
and are located in the membranes of the endoplasmic reticulum. They function in
NH2-terminal proteolytic processing, as shown for the yeast STE24 gene product. This gene is
required for the correct processing of a-factor, a yeast pheromone. Family M48 peptidases
also appear to be required for some prenylation reactions, mediating COOH-terminal CAAX
processing. Prenylation reactions are believed to be involved in the regulation of
protein-protein and protein-membrane interactions. As an example, RAS GTPase activity is
regulated in part by localization to the inner side of the plasma membrane upon prenylation.
In plants, proteins from this family could be involved in pollen-stigma interactions such as
those mediating self-pollination vs. outcrossing, or could be members of several secondary
metabolism pathways.
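
The COOH-terminal CAAX box referred to above is conventionally read as a cysteine, two aliphatic residues and any final residue at the very end of the protein. A minimal Python sketch of such a check follows. The set of residues treated as aliphatic (A, V, L, I, M) is an assumption made for illustration, as are the function name and the test peptides.

```python
# Residues treated as aliphatic in this sketch (assumption; definitions vary).
ALIPHATIC = set("AVLIM")


def has_caax_box(sequence: str) -> bool:
    """Return True if the last four residues fit C-A-A-X (A = aliphatic, X = any)."""
    tail = sequence.upper()[-4:]
    if len(tail) < 4:
        return False
    return tail[0] == "C" and tail[1] in ALIPHATIC and tail[2] in ALIPHATIC


if __name__ == "__main__":
    # Synthetic peptides, for demonstration only.
    print(has_caax_box("MTEYKCVLS"))
    print(has_caax_box("MTEYKCDLS"))
```

A positive result marks a candidate substrate for prenylation-dependent processing only. It does not show that a family M48 peptidase acts on the protein.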

Ref: Fujimura-Kamada et al. (1997) J Cell Biol 136: 271-85
Tam et al. (1998) J Cell Biol 142: 635-49

Z. DNA Pol Viral N 

The DNA pol Viral N domain is located at the N-terminal region of DNA polymerase
isolated from several retroid viruses such as the Cauliflower Mosaic Virus. The domain
motif has also been found in numerous other species from humans to cyanobacteria. In these
organisms, this motif seems to be associated with two types of sequences: retrotransposons
and mitochondrial genes. In the mitochondrial sequences this domain is potentially involved
in the self-splicing conducted by group II introns. Various manipulations of this gene in
plants allow control of the numerous retrotransposons endogenous to plant genomes or
allow engineering of mitochondrial function, especially to increase efficiency of energy
utilization by cells.

Ref: Chapdelaine and Bonen (1991) Cell 65: 465-72
Ferat and Michel (1993) Nature 364: 358-61
Wilson et al. (1994) 368: 32-8
Cambareri et al. (1994) 242: 658-65
Gaardner et al. (1981) NAR 9: 2871-2888
Cummings et al. (1990) Curr Genet 17: 375-402
Hattori et al. (1986) Nature 321: 625-8

Aa. Calpaininhib 
This domain is found in calpastatin, an inhibitor protein specific for calpain. Calpain
is a non-lysosomal calcium-dependent intracellular protease that appears to be involved in
the dynamic changes of the cytoskeleton, especially actin-related structures, during early
Drosophila embryogenesis [1]. Calpastatins co-exist in cells with calpains and the subcellular
distribution of calpastatin is thought to be important to calpain regulation [2]. In plants
calpains and calpastatins could be involved in embryogenesis and non-embryogenic organ
reiteration. Mutations occurring in calpain inhibitor repeat domains would produce
developmental abnormalities such as abnormal leaf, root or flower development.

Refs 

1 Emori Y and Saigo K (1994) J Biol Chem 269: 25137-42.
2 Mellgren RL, Lane RD, Mericle MT (1989) Biochim Biophys Acta 999: 71-77.

Ab. chorismate_bind 
Chorismate binding domains are present in plant anthranilate synthase (AS) genes. AS
enzymes catalyze the first step in the biosynthesis of tryptophan by converting chorismate and
L-glutamine to anthranilate, pyruvate and L-glutamate. Some of these enzymes are subject to
feedback inhibition by tryptophan [1] while some are feedback insensitive [2]. In
Arabidopsis, two AS genes have overlapping, but different distributions. One of these AS
genes is induced by wounding and bacterial pathogen infiltration [1]. Mutations in the
chorismate binding domain would affect the production of tryptophan and could influence the
plant's defense system. AS gene products can be used for in vitro synthesis of tryptophan
and tryptophan derivatives.

Refs 

1 Niyogi KK, Fink GR (1992) Plant Cell 4: 721-33.
2 Song HS, Brotherton JE, Gonzales RA, Wilholm JM (1998) Plant Physiol 117: 533-43.

Ac. late_protein_L2 
Papillomaviruses are encapsulated double stranded DNA viruses. Plants are susceptible to
infection by double stranded DNA viruses such as Cauliflower Mosaic virus (CaMV). The
coat proteins in these plant viruses are critical to the virus life cycle within the plant. For
example, the coat protein of CaMV is thought to be involved in intra- and inter-cellular
movement within the plant [1]. Engineering of proteins having similarity to papillomavirus
coat proteins may enable the production of plants having better resistance to natural plant
double stranded DNA viruses.

Refs 

1 Thompson SR, Melcher U (1993) J Gen Virol 74: 1141-8.

Ad. Peptidase_M41 
Proteins belonging to this peptidase family are metalloproteases that bind zinc as a cofactor
and are integral membrane proteins. They seem to be involved in the degradation of
carboxy-terminal-tagged cytoplasmic proteins. In plants, these proteins are located in the thylakoid
membranes of the chloroplasts. Their expression is light regulated and they are thought to be
involved in degradation of soluble stromal proteins and turn-over of thylakoid proteins [1].
Manipulation of expression and structure of these proteins would have effects on the
efficiency of photosynthesis and the development of chloroplasts.

Refs 

1 Lindahl M, Tabak S, Cseke L, Pichersky E, Andersson B, Adam Z (1996) J Biol
Chem 271: 29329-34.

Ae. UPF0051

There is some evidence that, in plants, proteins in this family are involved in ATP synthesis 
in chloroplasts [1, 2]. Mutations in these proteins or altering their expression would affect 
the efficiency of photosynthesis and energy production. 

Refs

1 Kostrzewa M, Zetsche K (1992) J Mol Biol 227: 961-70. 

2 Kostrzewa M, Zetsche K (1993) Plant Mol Biol 23: 67-76 

Af. E7

Papillomaviruses are encapsulated double stranded DNA viruses. The Papillomavirus early
protein 7 (E7) is known as a potent immortalizing and transforming agent. Transformation by
E7 is thought to be mediated by the physical association of E7 with cellular proteins
regulating entry into the cell cycle [1]. The result is entry into the cell cycle and suppression
of terminal differentiation in mammalian cells. Thus, engineering of proteins having
similarity to papillomavirus E7 protein enables the production of plants having altered
cellular proliferation characteristics and possibly altered morphology. For example,
overexpression of E7-like proteins would be expected to result in proliferation of cells of the
tissue in which the E7 protein is expressed, perhaps with suppression of differentiation
events. Thus, for example, overexpression of E7-like proteins in meristem cells can result in
taller plants and suppression of leafing and/or flowering.

Refs 

1 Zwerschke W, Jansen-Durr P (2000) Adv Cancer Res 78: 1-29

Ag. Peptidase U7

This protein is known to be an integral membrane protein in the cyanobacterium
Synechocystis where it functions to digest cleaved signal peptides [1]. This activity is
necessary to maintain proper secretion of mature proteins across the membrane. In higher
plants this protein may be present in the plastid or chloroplast membranes where it would
function by enabling protein movement into and out of the chloroplasts. Mutations in this
protein would be expected to affect the development of plastids, including chloroplasts, or
alter the energy transfer system within the chloroplasts, thereby affecting growth and
development.

Refs 

1 Kaneko T, Sato S, Kotani H, Tanaka A, Asamizu E, Nakamura Y, Miyajima N,
Hirosawa M, Sugiura M, Sasamoto S, Kimura T, Hosouchi T, Matsuno A, Muraki A,
Nakazaki N, Naruo K, Okumura S, Shimpo S, Takeuchi C, Wada T, Watanabe A,
Yamada M, Yasuda M, Tabata S (1996) DNA Res 3:109-36.

Ah. 5'-3' Exonuclease

The 5'-3' exonuclease domain is found in bacterial DNA polymerases I and in yeast DNA
repair enzymes such as Exonuclease I. Yeast Exo I is involved in mitotic recombination and
also includes a domain that interacts with the mismatch repair protein MSH2. The 5'-3'
exonuclease domain is also present in XPG DNA repair enzymes in humans and in yeast
RAD9 protein. Defects in XPG proteins result in Xeroderma Pigmentosum. Thus defects in
5'-3' exonuclease domain-containing proteins in plants are expected to lead to defects in DNA
repair and corresponding high spontaneous and inducible mutation rates. Consensus
sequence:



IMKKKLLLVDGSSLAFRAFFALPPLTNSAGEPTNAVYGFLKMLIKLIEQEQPTHIAW
FDAKAKTFRHELYEGYKAGRAP

TPDELREQIPLIKELLDALGIPLLEVAGYEADDVIGTLAKLAEKEGYEVLIVTGDRDLL
QLVSDHVTVIITKKGIAEFTL

FTPEAVIEKYGLTPEQIIDYKALMGDSSDNIPGVKGIGEKTAAKLLQEYGSLEGIYANL
DKLKGKKLREKLLAHKEDAKL
SRDLATIKTDVPLDLTLDDLRLPDPDRDALDLLFDE



Ref: 

Fiorentini P. et al. RT. Mol. Cell. Biol. 17:2764-2773(1997). 
Tishkoff et al. Cancer Res. 0:0-0(1998). 
Macinnes M.A. et al. Mol. Cell. Biol. 13:6393-6402(1993).

AA. Activities of Polypeptides Comprising Signal Peptides

Polypeptides comprising signal peptides are a family of proteins that are typically
targeted (1) to a particular organelle or intracellular compartment, (2) to interact with a
particular molecule or (3) for secretion outside of a host cell. Examples of polypeptides
comprising signal peptides include, without limitation, secreted proteins, soluble proteins,
receptors, proteins retained in the ER, etc.

These proteins comprising signal peptides are useful to modulate ligand-receptor
interactions, cell-to-cell communication, signal transduction, intracellular communication,
and activities and/or chemical cascades that take place in an organism outside or within any
particular cell.

One class of such proteins is soluble proteins which are transported out of the cell.
These proteins can act as ligands that bind to a receptor to trigger signal transduction or to
permit communication between cells.

Another class is receptor proteins which also comprise a retention domain that lodges
the receptor protein in the membrane when the cell transports the receptor to the surface of
the cell. Like the soluble ligands, receptors can also modulate signal transduction and
communication between cells.

In addition, the signal peptide itself can serve as a ligand for some receptors. An
example is the interaction of the ER targeting signal peptide with the signal recognition
particle (SRP). Here, the SRP binds to the signal peptide, halting translation, and the
resulting SRP complex then binds to docking proteins located on the surface of the ER,
prompting transfer of the protein into the ER.



A description of signal peptide residue composition is provided below in Subsection
IV.C.1.

III. Methods of Modulating Polypeptide Production

It is contemplated that polynucleotides of the invention can be incorporated into a
host cell or in-vitro system to modulate polypeptide production. For instance, the SDFs
prepared as described herein can be used to prepare expression cassettes useful in a number of
techniques for suppressing or enhancing expression.

For example, polynucleotides of the present invention comprising sequences to be
transcribed, such as coding sequences, can be inserted into nucleic acid constructs to
modulate polypeptide production. Typically, such sequences to be transcribed are
heterologous to at least one element of the nucleic acid construct to generate a chimeric gene
or construct.

Another example of useful polynucleotides is nucleic acid molecules comprising
regulatory sequences of the present invention. Chimeric genes or constructs can be generated
when the regulatory sequences of the invention are linked to heterologous sequences in a vector
construct. Within the scope of the invention are such chimeric genes and/or constructs.

Also within the scope of the invention are nucleic acid molecules, whereof at least a part
or fragment of these DNA molecules is presented in TABLE 1 of the present application, and
wherein the coding sequence is under the control of its own promoter and/or its own regulatory
elements. Such molecules are useful for transforming the genome of a host cell or an organism
regenerated from said host cell for modulating polypeptide production.

Additionally, a vector capable of producing the oligonucleotide can be inserted into the
host cell to deliver the oligonucleotide.

Components to be included in vector constructs are described in more detail both above
and below.

Whether the chimeric vectors or native nucleic acids are utilized, such
polynucleotides can be incorporated into a host cell to modulate polypeptide production.

Native genes and/or nucleic acid molecules can be effective when exogenous to the host cell.
Methods of modulating polypeptide expression include, without limitation:
Suppression methods, such as
    Antisense
    Ribozymes
    Co-suppression
    Insertion of Sequences into the Gene to be Modulated
    Regulatory Sequence Modulation

as well as Methods for Enhancing Production, such as
    Insertion of Exogenous Sequences; and
    Regulatory Sequence Modulation.



III.A. Suppression 

Expression cassettes of the invention can be used to suppress expression of
endogenous genes which comprise the SDF sequence. Inhibiting expression can be useful,
for instance, to tailor the ripening characteristics of a fruit (Oeller et al., Science 254:437
(1991)) or to influence seed size (WO98/07842) or to provoke cell ablation (Mariani et al.,
Nature 357:384-387 (1992)).

As described above, a number of methods can be used to inhibit gene expression in
plants, such as antisense, ribozyme, introduction of exogenous genes into a host cell,
insertion of a polynucleotide sequence into the coding sequence and/or the promoter of the
endogenous gene of interest, and the like.

III.A.1. Antisense

An expression cassette as described above can be transformed into a host cell or
plant to produce an antisense strand of RNA. For plant cells, antisense RNA inhibits gene
expression by preventing the accumulation of mRNA which encodes the enzyme of interest; see,
e.g., Sheehy et al., Proc. Nat. Acad. Sci. USA, 85:8805 (1988), and Hiatt et al., U.S. Patent No.
4,801,340.
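
As a minimal illustration of what "antisense" means at the sequence level, the following Python sketch computes the reverse complement of a sense-strand DNA fragment; a fragment cloned in this reversed orientation relative to the promoter would be transcribed into RNA complementary to the target mRNA. The function name and the example fragment are hypothetical and are not taken from TABLE 1.

```python
# Hypothetical helper, for illustration only: reverse complement of a DNA fragment.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string containing only A, C, G, T."""
    invalid = set(dna.upper()) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    # Complement each base, then reverse the strand.
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense_fragment = "ATGGCTTCA"  # made-up example, not an SDF
    print(reverse_complement(sense_fragment))
```

Applying the function twice returns the original fragment, which is a convenient sanity check when preparing an antisense insert in silico.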



III.A.2. Ribozymes

Similarly, ribozyme constructs can be transformed into a plant to cleave mRNA 
and down-regulate translation. 



III.A.3. Co-Suppression

Another method of suppression is by introducing an exogenous copy of the gene
to be suppressed. Introduction of expression cassettes in which a nucleic acid is configured in
the sense orientation with respect to the promoter has been shown to prevent the accumulation of
mRNA. This method is described in detail above.

III.A.4. Insertion of Sequences into the Gene to be Modulated

Yet another means of suppressing gene expression is to insert a polynucleotide 
into the gene of interest to disrupt transcription or translation of the gene. 

Homologous recombination could be used to target a polynucleotide insert to a
gene using the Cre-Lox system (A.C. Vergunst et al., Nucleic Acids Res. 26:2729 (1998); A.C.
Vergunst et al., Plant Mol. Biol. 38:393 (1998); H. Albert et al., Plant J. 7:649 (1995)).

In addition, random insertion of polynucleotides into a host cell genome can also
be used to disrupt the gene of interest (Azpiroz-Leehan et al., Trends in Genetics 13:152 (1997)).
In this method, screening for clones from a library containing random insertions is preferred for
identifying those that have polynucleotides inserted into the gene of interest. Such screening can
be performed using probes and/or primers described above based on sequences from TABLE 1,
fragments thereof, and substantially similar sequences thereto. The screening can also be
performed by selecting clones or any transgenic plants having a desired phenotype.

III.A.5. Regulatory Sequence Modulation

The SDFs described in Table 1, and fragments thereof are examples of 
nucleotides of the invention that contain regulatory sequences that can be used to suppress or 
inactivate transcription and/or translation from a gene of interest as discussed in I.C.5. 

III.A.6. Genes Comprising Dominant-Negative Mutations 

When suppression of production of the endogenous, native protein is desired, it
is often helpful to express a gene comprising a dominant negative mutation. Production of
protein variants from genes comprising dominant negative mutations is a useful
tool for research. Genes comprising dominant negative mutations can produce a variant
polypeptide which is capable of competing with the native polypeptide, but which does not
produce the native result. Consequently, overexpression of genes comprising these mutations
can titrate out an undesired activity of the native protein. For example, the product from a
gene comprising a dominant negative mutation of a receptor can be used to constitutively
activate or suppress a signal transduction cascade, allowing examination of the phenotype
and thus the trait(s) controlled by that receptor and pathway. Alternatively, the protein arising
from the gene comprising a dominant-negative mutation can be an inactive enzyme still capable
of binding to the same substrate as the native protein and therefore competes with such native
protein.

Products from genes comprising dominant-negative mutations can also act upon 
the native protein itself to prevent activity. For example, the native protein may be active only 
as a homo-multimer or as one subunit of a hetero-multimer. Incorporation of an inactive subunit 
into the multimer with native subunit(s) can inhibit activity. 
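
The effect of an inactive subunit on a multimer can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that native and mutant subunits assemble at random and that a single mutant subunit inactivates the whole multimer; under those assumptions the fraction of active multimers is (1 - f)^n, where f is the mutant fraction of the subunit pool and n is the number of subunits per multimer. Neither assumption is stated in this specification, and the function name is hypothetical.

```python
def active_multimer_fraction(mutant_fraction: float, subunits: int) -> float:
    """Fraction of multimers made only of native subunits.

    Assumes random assembly and that one mutant subunit inactivates the
    multimer (illustrative assumptions, not taken from the specification).
    """
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must be between 0 and 1")
    if subunits < 1:
        raise ValueError("subunits must be at least 1")
    return (1.0 - mutant_fraction) ** subunits


if __name__ == "__main__":
    # Equal amounts of native and mutant subunits, at several multimer sizes.
    for n in (1, 2, 4):
        print(n, active_multimer_fraction(0.5, n))
```

Under these assumptions the loss of activity grows with multimer size, which is one way to see why a dominant-negative subunit can suppress activity more strongly than its share of the subunit pool alone would suggest.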

Thus, gene function can be modulated in host cells of interest by inserting into
these cells vector constructs comprising a gene comprising a dominant-negative mutation.

III.B. Enhanced Expression 

Enhanced expression of a gene of interest in a host cell can be accomplished by either 
(1) insertion of an exogenous gene; or (2) promoter modulation. 

III.B.1. Insertion of an Exogenous Gene

Insertion of an expression construct encoding an exogenous gene can boost the 
number of gene copies expressed in a host cell. 

Such expression constructs can comprise genes that either encode the native
protein that is of interest or that encode a variant that exhibits enhanced activity as compared to
the native protein. Such genes encoding proteins of interest can be constructed from the
sequences from TABLE 1, fragments thereof, and substantially similar sequences thereto.

Such an exogenous gene can include either a constitutive promoter permitting 
expression in any cell in a host organism or a promoter that directs transcription only in 
particular cells or times during a host cell life cycle or in response to environmental stimuli. 



III.B.2. Regulatory Sequence Modulation

The SDFs of Table 1, and fragments thereof, contain regulatory sequences that
can be used to enhance expression of a gene of interest. For example, some of these sequences
contain useful enhancer elements. In some cases, duplication of enhancer elements or insertion
of exogenous enhancer elements will increase expression of a desired gene from a particular
promoter. As other examples, all promoters require binding of a regulatory protein to be
activated, while some promoters may need a protein that signals a promoter binding protein to
expose a polymerase binding site. In either case, over-production of such proteins can be used
to enhance expression of a gene of interest by increasing the activation time of the promoter.

Such regulatory proteins are encoded by some of the sequences in TABLE 1,
fragments thereof, and substantially similar sequences thereto.

Coding sequences for these proteins can be constructed as described above. 



IV. Gene Constructs and Vector Construction

To use isolated SDFs of the present invention or a combination of them or parts and/or
mutants and/or fusions of said SDFs in the above techniques, recombinant DNA vectors which
comprise said SDFs and are suitable for transformation of cells, such as plant cells, are usually
prepared. The SDF construct can be made using standard recombinant DNA techniques
(Sambrook et al. 1989) and can be introduced to the species of interest by
Agrobacterium-mediated transformation or by other means of transformation (e.g., particle gun
bombardment) as referenced below.

The vector backbone can be any of those typical in the art such as plasmids, viruses,
artificial chromosomes, BACs, YACs and PACs and vectors of the sort described by

(a) BAC: Shizuya et al., Proc. Natl. Acad. Sci. USA 89: 8794-8797 (1992);
Hamilton et al., Proc. Natl. Acad. Sci. USA 93: 9975-9979 (1996);

(b) YAC: Burke et al., Science 236:806-812 (1987);

(c) PAC: Sternberg N. et al., Proc Natl Acad Sci USA. Jan;87(1):103-7 (1990);

(d) Bacteria-Yeast Shuttle Vectors: Bradshaw et al., Nucl Acids Res 23: 4850-4856 (1995);

(e) Lambda Phage Vectors: Replacement Vector, e.g.,
Frischauf et al., J. Mol Biol 170: 827-842 (1983); or Insertion vector, e.g.,
Huynh et al., In: Glover NM (ed) DNA Cloning: A practical Approach, Vol. 1 Oxford: IRL
Press (1985);

(f) T-DNA gene fusion vectors: Walden et al., Mol Cell Biol 1: 175-194 (1990);
and

(g) Plasmid vectors: Sambrook et al., infra.

Typically, a vector will comprise the exogenous gene, which in its turn comprises an
SDF of the present invention to be introduced into the genome of a host cell, and which gene
may be an antisense construct, a ribozyme construct, a chimeraplast, or a coding sequence with
any desired transcriptional and/or translational regulatory sequences, such as promoters, UTRs,
and 3' end termination sequences. Vectors of the invention can also include origins of
replication, scaffold attachment regions (SARs), markers, homologous sequences, introns, etc.

A DNA sequence coding for the desired polypeptide, for example a cDNA sequence
encoding a full length protein, will preferably be combined with transcriptional and translational
initiation regulatory sequences which will direct the transcription of the sequence from the gene
in the intended tissues of the transformed plant.

For example, for over-expression, a plant promoter fragment may be employed that will
direct transcription of the gene in all tissues of a regenerated plant. Alternatively, the plant
promoter may direct transcription of an SDF of the invention in a specific tissue (tissue-specific
promoters) or may be otherwise under more precise environmental control (inducible
promoters).

If proper polypeptide production is desired, a polyadenylation region at the 3'-end of the
coding region is typically included. The polyadenylation region can be derived from the natural
gene, from a variety of other plant genes, or from T-DNA.

The vector comprising the sequences from genes or SDF of the invention may
comprise a marker gene that confers a selectable phenotype on plant cells. The vector can
include promoter and coding sequence, for instance. For example, the marker may encode
biocide resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418,
bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorsulfuron or
phosphinothricin.

IV.A. Coding Sequences

Generally, the sequence in the transformation vector and to be introduced into
the genome of the host cell does not need to be absolutely identical to an SDF of the present
invention. Also, it is not necessary for it to be full length, relative to either the primary
transcription product or fully processed mRNA. Furthermore, the introduced sequence need not
have the same intron or exon pattern as a native gene. Also, heterologous non-coding segments
can be incorporated into the coding sequence without changing the desired amino acid sequence
of the polypeptide to be produced.






IV.B. Promoters 

As explained above, introducing an exogenous SDF from the same species or an
orthologous SDF from another species can modulate the expression of a native gene
corresponding to that SDF of interest. Such an SDF construct can be under the control of
either a constitutive promoter or a highly regulated inducible promoter (e.g., a copper
inducible promoter). The promoter of interest can initially be either endogenous or
heterologous to the species in question. When re-introduced into the genome of said species,
such promoter becomes exogenous to said species. Over-expression of an SDF transgene can
lead to co-suppression of the homologous endogenous sequence, thereby creating some
alterations in the phenotypes of the transformed species as demonstrated by similar analysis
of the chalcone synthase gene (Napoli et al., Plant Cell 2:279 (1990) and van der Krol et al.,
Plant Cell 2:291 (1990)). If an SDF is found to encode a protein with desirable
characteristics, its over-production can be controlled so that its accumulation can be
manipulated in an organ- or tissue-specific manner utilizing a promoter having such
specificity.

Likewise, if the promoter of an SDF (or an SDF that includes a promoter) is found to
be tissue-specific or developmentally regulated, such a promoter can be utilized to drive or
facilitate the transcription of a specific gene of interest (e.g., seed storage protein or
root-specific protein). Thus, the level of accumulation of a particular protein can be manipulated
or its spatial localization in an organ- or tissue-specific manner can be altered.

IV.C. Signal Peptides

SDFs of the present invention containing signal peptides are indicated in Table 1. In
some cases it may be desirable for the protein encoded by an introduced exogenous or
orthologous SDF to be targeted (1) to a particular organelle or intracellular compartment, (2) to
interact with a particular molecule such as a membrane molecule or (3) for secretion outside
of the cell harboring the introduced SDF. This will be accomplished using a signal peptide.

Signal peptides direct protein targeting, are involved in ligand-receptor interactions
and act in cell to cell communication. Many proteins, especially soluble proteins, contain a
signal peptide that targets the protein to one of several different intracellular compartments.
In plants, these compartments include, but are not limited to, the endoplasmic reticulum (ER),
mitochondria, plastids (such as chloroplasts), the vacuole, the Golgi apparatus, protein
storage vesicles (PSV) and, in general, membranes. Some signal peptide sequences are
conserved, such as the Asn-Pro-Ile-Arg amino acid motif found in the N-terminal propeptide
signal that targets proteins to the vacuole (Marty (1999) The Plant Cell 11: 587-599). Other
signal peptides do not have a consensus sequence per se, but are largely composed of
hydrophobic amino acids, such as those signal peptides targeting proteins to the ER (Vitale
and Denecke (1999) The Plant Cell 11: 615-628). Still others do not appear to contain either
a consensus sequence or an identified common secondary sequence, for instance the
chloroplast stromal targeting signal peptides (Keegstra and Cline (1999) The Plant Cell 11:
557-570). Furthermore, some targeting peptides are bipartite, directing proteins first to an
organelle and then to a membrane within the organelle (e.g. within the thylakoid lumen of the
chloroplast; see Keegstra and Cline (1999) The Plant Cell 11: 557-570). In addition to the
diversity in sequence and secondary structure, placement of the signal peptide is also varied.
Proteins destined for the vacuole, for example, have targeting signal peptides found at the
N-terminus, at the C-terminus and at a surface location in mature, folded proteins. Signal
peptides also serve as ligands for some receptors.
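
Two of the sequence features mentioned above, the Asn-Pro-Ile-Arg (NPIR) vacuolar motif and the hydrophobic character of ER-targeting signal peptides, lend themselves to a simple in silico check. The Python sketch below is illustrative only: the use of the Kyte-Doolittle hydropathy scale, the window length, the N-terminal search lengths, and all function names are assumptions introduced here and are not part of this specification.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy of a peptide (one-letter codes)."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def most_hydrophobic_window(protein: str, n_term: int = 30, window: int = 8) -> float:
    """Highest mean hydropathy of any window within the first n_term residues."""
    region = protein[:n_term]
    if len(region) < window:
        raise ValueError("protein is shorter than the window")
    return max(
        mean_hydropathy(region[i:i + window])
        for i in range(len(region) - window + 1)
    )


def has_npir_motif(protein: str, n_term: int = 40) -> bool:
    """True if the NPIR motif occurs within the first n_term residues."""
    return "NPIR" in protein[:n_term].upper()


if __name__ == "__main__":
    candidate = "MKLLLLLLLLAAAKDE"  # made-up sequence, not from TABLE 1
    print(most_hydrophobic_window(candidate), has_npir_motif(candidate))
```

A high window score near the N-terminus is only suggestive of a hydrophobic signal core; as the text notes, some targeting peptides have no consensus sequence, so a check of this kind cannot by itself establish targeting.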

These characteristics of signal peptides can be used to more tightly control the
phenotypic expression of introduced SDFs. In particular, associating the appropriate signal
sequence with a specific SDF can allow sequestering of the protein in specific organelles
(plastids, as an example), secretion outside of the cell, targeting interaction with particular
receptors, etc. Hence, the inclusion of signal peptides in constructs involving the SDFs of the
invention increases the range of manipulation of SDF phenotypic expression. The nucleotide
sequence of the signal peptide can be isolated from characterized genes using common
molecular biological techniques or can be synthesized in vitro.

In addition, the native signal peptide sequences, both amino acid and nucleotide, 
described in Table 1 can be used to modulate polypeptide transport. Further variants of the 
native signal peptides described in Table 1 are contemplated. Insertions, deletions, or 
substitutions can be made. Such variants will retain at least one of the functions of the native 
signal peptide as well as exhibiting some degree of sequence identity to the native sequence. 

Also, fragments of the signal peptides of the invention are useful and can be fused with 
other signal peptides of interest to modulate transport of a polypeptide. 

V. Transformation Techniques 

A wide range of techniques for inserting exogenous polynucleotides are known for a
number of host cells, including, without limitation, bacterial, yeast, mammalian, insect and plant
cells.

Techniques for transforming a wide variety of higher plant species are well known and
described in the technical and scientific literature. See, e.g., Weising et al., Ann. Rev. Genet.
22:421 (1988); and Christou, Euphytica, v. 85, n. 1-3:13-27 (1995).

DNA constructs of the invention may be introduced into the genome of the desired plant
host by a variety of conventional techniques. For example, the DNA construct may be
introduced directly into the genomic DNA of the plant cell using techniques such as
electroporation and microinjection of plant cell protoplasts, or the DNA constructs can be
introduced directly to plant tissue using ballistic methods, such as DNA particle bombardment.
Alternatively, the DNA constructs may be combined with suitable T-DNA flanking regions and
introduced into a conventional Agrobacterium tumefaciens host vector. The virulence functions
of the Agrobacterium tumefaciens host will direct the insertion of the construct and adjacent
marker into the plant cell DNA when the cell is infected by the bacteria (McCormac et al., Mol.
Biotechnol. 8:199 (1997); Hamilton, Gene 200:107 (1997); Salomon et al. EMBO J. 3:141
(1984); Herrera-Estrella et al. EMBO J. 2:987 (1983)).

Microinjection techniques are known in the art and well described in the scientific and
patent literature. The introduction of DNA constructs using polyethylene glycol precipitation is
described in Paszkowski et al. EMBO J. 3:2717 (1984). Electroporation techniques are
described in Fromm et al. Proc. Natl Acad. Sci. USA 82:5824 (1985). Ballistic transformation
techniques are described in Klein et al. Nature 327:773 (1987). Agrobacterium
tumefaciens-mediated transformation techniques, including disarming and use of binary or
co-integrate vectors, are well described in the scientific literature. See, for example Hamilton, CM.,
Gene 200:107 (1997); Müller et al. Mol. Gen. Genet. 207:171 (1987); Komari et al. Plant J.
10:165 (1996); Venkateswarlu et al. Biotechnology 9:1103 (1991) and Gleave, AP., Plant Mol.
Biol. 20:1203 (1992); Graves and Goldman, Plant Mol. Biol. 7:34 (1986) and Gould et al., Plant
Physiology 95:426 (1991).

Transformed plant cells which are derived by any of the above transformation
techniques can be cultured to regenerate a whole plant that possesses the transformed genotype
and thus the desired phenotype such as seedlessness. Such regeneration techniques rely on
manipulation of certain phytohormones in a tissue culture growth medium, typically relying on a
biocide and/or herbicide marker which has been introduced together with the desired nucleotide
sequences. Plant regeneration from cultured protoplasts is described in Evans et al., Protoplasts
Isolation and Culture in "Handbook of Plant Cell Culture," pp. 124-176, MacMillan Publishing
Company, New York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73,
CRC Press, Boca Raton, 1988. Regeneration can also be obtained from plant callus, explants,
organs, or parts thereof. Such regeneration techniques are described generally in Klee et al. Ann.
Rev. of Plant Phys. 38:467 (1987). Regeneration of monocots (rice) is described by Hosoyama
et al. (Biosci. Biotechnol. Biochem. 58:1500 (1994)) and by Ghosh et al. (J. Biotechnol. 32:1
(1994)). The nucleic acids of the invention can be used to confer desired traits on essentially any
plant.

Thus, the invention has use over a broad range of plants, including species from the
genera Anacardium, Arachis, Asparagus, Atropa, Avena, Brassica, Citrus, Citrullus, Capsicum,
Carthamus, Cocos, Coffea, Cucumis, Cucurbita, Daucus, Elaeis, Fragaria, Glycine, Gossypium,
Helianthus, Heterocallis, Hordeum, Hyoscyamus, Lactuca, Linum, Lolium, Lupinus,
Lycopersicon, Malus, Manihot, Majorana, Medicago, Nicotiana, Olea, Oryza, Panicum,
Pannesetum, Persea, Phaseolus, Pistachia, Pisum, Pyrus, Prunus, Raphanus, Ricinus, Secale,
Senecio, Sinapis, Solanum, Sorghum, Theobromus, Trigonella, Triticum, Vicia, Vitis, Vigna,
and Zea.

One of skill will recognize that after the expression cassette is stably incorporated in
transgenic plants and confirmed to be operable, it can be introduced into other plants by
sexual crossing. Any of a number of standard breeding techniques can be used, depending
upon the species to be crossed.

The particular sequences of SDFs identified are provided in the attached TABLE 1.
One of ordinary skill in the art, having this data, can obtain cloned DNA fragments, synthetic
DNA fragments or polypeptides constituting desired sequences by recombinant methodology
known in the art or described herein.



EXAMPLES 

The invention is illustrated by way of the following examples. The invention is not
limited by these examples as the scope of the invention is defined solely by the claims
following.



EXAMPLE 1: cDNA PREPARATION 

A number of the nucleotide sequences disclosed in TABLE 1 herein as representative of
the SDFs of the invention can be obtained by sequencing genomic DNA (gDNA) and/or cDNA
from corn plants grown from HYBRID SEED # 35A19, purchased from Pioneer Hi-Bred
International, Inc., Supply Management, P.O. Box 256, Johnston, Iowa 50131-0256.

A number of the nucleotide sequences disclosed in TABLE 1 herein as representative
of the SDFs of the invention can also be obtained by sequencing genomic DNA from
Arabidopsis thaliana, Wassilewskija ecotype or by sequencing cDNA obtained from mRNA
from such plants as described below. This is a true breeding strain. Seeds of the plant are
available from the Arabidopsis Biological Resource Center at the Ohio State University,
under the accession number CS2360. Seeds of this plant were deposited under the terms and
conditions of the Budapest Treaty at the American Type Culture Collection, Manassas, VA
on August 31, 1999, and were assigned ATCC No. PTA-595.

Other methods for cloning full-length cDNA are described, for example, by Seki et
al., Plant Journal 15:707-720 (1998) "High-efficiency cloning of Arabidopsis full-length
cDNA by biotinylated Cap trapper"; Maruyama et al., Gene 138:171 (1994) "Oligo-capping: a
simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides";
and WO 96/34981.

Each tissue or organ was individually pulverized and frozen in liquid
nitrogen. Next, the samples were homogenized in the presence of detergents and then
centrifuged. The debris and nuclei were removed from the sample and more detergents were
added to the sample. The sample was centrifuged and the debris was removed. Then the
sample was applied to a 2M sucrose cushion to isolate polysomes. The RNA was isolated by
treatment with detergents and proteinase K followed by ethanol precipitation and
centrifugation. The polysomal RNA from the different tissues was pooled according to the
following mass ratios: 15/15/1 for male inflorescences, female inflorescences and root,
respectively. The pooled material was then used for cDNA synthesis by the methods
described below.
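
The 15/15/1 pooling ratio can be turned into per-tissue masses with a short calculation. The Python sketch below is a worked example only; the function name and the 31 microgram total are hypothetical and are not values reported in this specification.

```python
def pool_masses(total_mass: float, ratios: dict) -> dict:
    """Split a total RNA mass among tissues according to integer mass ratios."""
    if total_mass <= 0:
        raise ValueError("total_mass must be positive")
    if not ratios or any(parts <= 0 for parts in ratios.values()):
        raise ValueError("ratios must be non-empty and positive")
    total_parts = sum(ratios.values())
    # Each tissue contributes its share of the total in proportion to its parts.
    return {tissue: total_mass * parts / total_parts
            for tissue, parts in ratios.items()}


if __name__ == "__main__":
    corn_ratios = {
        "male inflorescence": 15,
        "female inflorescence": 15,
        "root": 1,
    }
    # Hypothetical total of 31 micrograms of polysomal RNA.
    for tissue, mass in pool_masses(31.0, corn_ratios).items():
        print(f"{tissue}: {mass:.2f} ug")
```

The same helper applies to any other mass ratio by changing the dictionary of parts.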

Starting material for cDNA synthesis for the exemplary corn cDNA clones
with sequences presented in TABLE 1 was poly(A)-containing polysomal mRNAs from
inflorescences and root tissues of corn plants grown from HYBRID SEED # 35A19. Male
inflorescences and female (pre- and post-fertilization) inflorescences were isolated at various
stages of development. Selection for poly(A)-containing polysomal RNA was done using
oligo d(T) cellulose columns, as described by Cox and Goldberg, "Plant Molecular Biology:
A Practical Approach", pp. 1-35, Shaw ed., c. 1988 by IRL, Oxford. The quality and the
integrity of the polyA+ RNAs were evaluated.

Starting material for cDNA synthesis for the exemplary Arabidopsis cDNA
clones with sequences presented in TABLE 1 was polysomal RNA isolated from the top-most
inflorescence tissues of Arabidopsis thaliana Wassilewskija (Ws.) and from roots of
Arabidopsis thaliana Landsberg erecta (L. er.), also obtained from the Arabidopsis
Biological Resource Center. Nine parts inflorescence to every part root were used, as
measured by wet mass. Tissue was pulverized and exposed to liquid nitrogen. Next, the
sample was homogenized in the presence of detergents and then centrifuged. The debris and
nuclei were removed from the sample and more detergents were added to the sample. The
sample was centrifuged and the debris was removed and the sample was applied to a 2M
sucrose cushion to isolate polysomal RNA (Cox et al., "Plant Molecular Biology: A Practical
Approach", pp. 1-35, Shaw ed., c. 1988 by IRL, Oxford). The polysomal RNA was used
for cDNA synthesis by the methods described below. Polysomal mRNA was then isolated as
described above for corn cDNA. The quality of the RNA was assessed electrophoretically.

Following preparation of the mRNAs from various tissues as described above, selection
of mRNA with intact 5' ends and specific attachment of an oligonucleotide tag to the 5' end of
such mRNA was performed using either a chemical or enzymatic approach. Both techniques
take advantage of the presence of the "cap" structure, which characterizes the 5' end of most
intact mRNAs and which comprises a guanosine generally methylated once, at the 7 position.

The chemical modification approach involves the optional elimination of the 2', 3'-cis
diol of the 3' terminal ribose, the oxidation of the 2', 3'-cis diol of the ribose linked to the cap of
the 5' ends of the mRNAs into a dialdehyde, and the coupling of the dialdehyde so obtained to
a derivatized oligonucleotide tag. Further detail regarding the chemical approaches for
obtaining mRNAs having intact 5' ends is disclosed in International Application No.
WO 96/34981 published November 7, 1996.

The enzymatic approach for ligating the oligonucleotide tag to the intact 5' ends of 
mRNAs involves the removal of the phosphate groups present on the 5' ends of uncapped 
incomplete mRNAs, the subsequent decapping of mRNAs having intact 5' ends and the ligation 
of the phosphate present at the 5' end of the decapped mRNA to an oligonucleotide tag. Further 
detail regarding the enzymatic approaches for obtaining mRNAs having intact 5' ends is 
disclosed in Dumas Milne Edwards J.B. (Doctoral Thesis of Paris VI University, Le clonage des 
ADNc complets: difficultés et perspectives nouvelles. Apports pour l'étude de la régulation de 
l'expression de la tryptophane hydroxylase de rat, 20 Dec. 1993), EPO 625572 and Kato et al., 
Gene 150:243-250 (1994). 

In both the chemical and the enzymatic approach, the oligonucleotide tag has a 
restriction enzyme site (e.g. an EcoRI site) therein to facilitate later cloning procedures. 
Following attachment of the oligonucleotide tag to the mRNA, the integrity of the mRNA is 
examined by performing a Northern blot using a probe complementary to the oligonucleotide 
tag. 

For the mRNAs joined to oligonucleotide tags using either the chemical or the enzymatic 
method, first strand cDNA synthesis is performed using an oligo-dT primer with reverse 
transcriptase. This oligo-dT primer can contain an internal tag of at least 4 nucleotides, which 
can be different from one mRNA preparation to another. Methylated dCTP is used for cDNA 
first strand synthesis to protect the internal EcoRI sites from digestion during subsequent steps. 
The first strand cDNA is precipitated using isopropanol after removal of RNA by alkaline 
hydrolysis to eliminate residual primers. 

Second strand cDNA synthesis is conducted using a DNA polymerase, such as Klenow 
fragment, and a primer corresponding to the 5' end of the ligated oligonucleotide. The primer is 
typically 20-25 bases in length. Methylated dCTP is used for second strand synthesis in order to 
protect internal EcoRI sites in the cDNA from digestion during the cloning process. 

Following second strand synthesis, the full-length cDNAs are cloned into a phagemid 
vector, such as pBlueScript™ (Stratagene). The ends of the full-length cDNAs are blunted with 
T4 DNA polymerase (Biolabs) and the cDNA is digested with EcoRI. Since methylated dCTP 
is used during cDNA synthesis, the EcoRI site present in the tag is the only hemi-methylated 
site and hence the only site susceptible to EcoRI digestion. In some instances, to facilitate 
subcloning, a Hind III adapter is added to the 3' end of full-length cDNAs. 

The full-length cDNAs are then size fractionated using either exclusion chromatography 
(AcA, Biosepra) or electrophoretic separation, which yields 3 to 6 different fractions. The 
full-length cDNAs are then directionally cloned into pBlueScript™ using either the EcoRI and 
SmaI restriction sites or, when the Hind III adapter is present in the full-length cDNAs, the 
EcoRI and Hind III restriction sites. The ligation mixture is transformed, preferably by 
electroporation, into bacteria, which are then propagated under appropriate antibiotic selection. 
Clones containing the oligonucleotide tag attached to full-length cDNAs are selected as 
follows. 

The plasmid cDNA libraries made as described above are purified (e.g. by a column 
available from Qiagen). A positive selection of the tagged clones is performed as follows. 
Briefly, in this selection procedure, the plasmid DNA is converted to single stranded DNA using 
phage F1 gene II endonuclease in combination with an exonuclease (Chang et al., Gene 127:95 
(1993)) such as exonuclease III or T7 gene 6 exonuclease. The resulting single stranded DNA is 
then purified using paramagnetic beads as described by Fry et al., Biotechniques 13:124 (1992). 
Here the single stranded DNA is hybridized with a biotinylated oligonucleotide having a 
sequence corresponding to the 3' end of the oligonucleotide tag. Preferably, the primer has a 
length of 20-25 bases. Clones including a sequence complementary to the biotinylated 
oligonucleotide are selected by incubation with streptavidin coated magnetic beads followed by 
magnetic capture. After capture of the positive clones, the plasmid DNA is released from the 
magnetic beads and converted into double stranded DNA using a DNA polymerase such as 
ThermoSequenase™ (obtained from Amersham Pharmacia Biotech). Alternatively, protocols 
such as the Gene Trapper™ kit (Gibco BRL) can be used. The double stranded DNA is then 
transformed, preferably by electroporation, into bacteria. The percentage of positive clones 
having the 5' tag oligonucleotide is typically estimated to be between 90 and 98% from dot blot 
analysis. 

Following transformation, the libraries are ordered in microtiter plates and sequenced. 
The Arabidopsis library was deposited at the American Type Culture Collection on January 
7, 2000 as "E-coli liba 010600" under the accession number PTA-1161. 

EXAMPLE 2: SOUTHERN HYBRIDIZATIONS 

The SDFs of the invention can be used in Southern hybridizations as described above. 

The following describes extraction of DNA from nuclei of plant cells, digestion of the 
nuclear DNA and separation by length, transfer of the separated fragments to membranes, 
preparation of probes for hybridization, hybridization and detection of the hybridized probe. 

The procedures described herein can be used to isolate related polynucleotides or for 
diagnostic purposes. Moderate stringency hybridization conditions, as defined above, are 
described in the present example. These conditions result in detection of hybridization 
between sequences having at least 70% sequence identity. As described above, the 
hybridization and wash conditions can be changed to reflect the desired percentage of 
sequence identity between probe and target sequences that can be detected. 

In the following procedure, a probe for hybridization is produced from two PCR 
reactions using two primers from genomic sequence of Arabidopsis thaliana. As described 
above, the particular template for generating the probe can be any desired template. 

The first PCR product is assessed to confirm that it is of the expected size. Then the 
product of the first PCR is used as a template, with the same pair of primers used in the 
first PCR, in a second PCR that produces a labeled product used as the probe. 

Fragments detected by hybridization, or other bands of interest, can be isolated from 
gels used to separate genomic DNA fragments by known methods for further purification 
and/or characterization. 



Buffers for nuclear DNA extraction 

1. 10X HB 

Component               Per 1000 ml   Purpose 
40 mM spermidine        10.2 g        Spermine (Sigma S-2876) and spermidine (Sigma S-2501) 
10 mM spermine          3.5 g         stabilize chromatin and the nuclear membrane 
0.1 M EDTA (disodium)   37.2 g        EDTA inhibits nuclease 
0.1 M Tris              12.1 g        Buffer 
0.8 M KCl               59.6 g        Adjusts ionic strength for stability of nuclei 

Adjust pH to 9.5 with 10 N NaOH. It appears that there is a nuclease present in 
leaves. Use of pH 9.5 appears to inactivate this nuclease. 
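
The gram amounts in the 10X HB recipe follow from mass = molarity x formula weight x volume. The short Python sketch below rechecks them. The formula weights are not stated in the text; they are assumptions (hydrochloride salts of spermidine and spermine, the dihydrate of disodium EDTA, and Tris base) and should be replaced by the values printed on the reagent labels actually used.

```python
# Recheck of the 10X HB weights: grams = mol/L * g/mol * L.
# ASSUMPTION: the formula weights below correspond to the salt/hydrate forms
# named in the keys; the protocol itself does not state which forms are used.
FORMULA_WEIGHT = {
    "spermidine (trihydrochloride)": 254.63,
    "spermine (tetrahydrochloride)": 348.18,
    "EDTA (disodium, dihydrate)": 372.24,
    "Tris (base)": 121.14,
    "KCl": 74.55,
}

# Target molarities taken from the 10X HB table.
MOLARITY = {
    "spermidine (trihydrochloride)": 0.040,
    "spermine (tetrahydrochloride)": 0.010,
    "EDTA (disodium, dihydrate)": 0.1,
    "Tris (base)": 0.1,
    "KCl": 0.8,
}


def grams_needed(molar, formula_weight, litres=1.0):
    """Mass of solute (g) for a solution of the given molarity and volume."""
    return molar * formula_weight * litres


for name, molar in MOLARITY.items():
    grams = grams_needed(molar, FORMULA_WEIGHT[name])
    print(f"{name:32s} {grams:5.1f} g per 1000 ml")
```

Under these assumed formula weights the computed masses round to the amounts listed in the table, which suggests the recipe was written for those reagent forms.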

2. 2 M sucrose (684 g per 1000 ml) 

Heat about half the final volume of water to about 50°C. Add the sucrose slowly, then 
bring the mixture to close to final volume; stir constantly until it has dissolved. Bring 
the solution to volume. 

3. Sarkosyl solution (lyses nuclear membranes) 

Component                        Per 1000 ml 
N-lauroyl sarcosine (Sarkosyl)   20.0 g 
0.1 M Tris                       12.1 g 
0.04 M EDTA (Disodium) 

Adjust the pH to 9.5 after all the components are dissolved and bring up to the proper 
volume. 

4. 20% Triton X-100 
80 ml Triton X-100 
320 ml 1X HB (w/o β-ME and PMSF) 
Prepare in advance; Triton takes some time to dissolve 

A. Procedure 

1. Prepare 1X "H" buffer (keep ice-cold during use) 

Component            Per 1000 ml   Purpose 
10X HB               100 ml 
2 M sucrose          250 ml        a non-ionic osmoticum 
Water                634 ml 

Added just before use: 

100 mM PMSF*         10 ml         a protease inhibitor; protects nuclear membrane proteins 
β-mercaptoethanol    1 ml          inactivates nuclease by reducing disulfide bonds 

*100 mM PMSF (phenyl methyl sulfonyl fluoride, Sigma P-7626): add 0.0875 g to 5 ml 100% ethanol. 



2. Homogenize the tissue in a blender (use 300-400 ml of 1X HB per blender). Be sure 
that you use 5-10 ml of HB buffer per gram of tissue. Blenders generate heat so be 
sure to keep the homogenate cold. It is necessary to put the blenders in ice 
periodically. 

Add the 20% Triton X-100 (25 ml per liter of homogenate) and gently stir on ice for 
20 min. This lyses plastid, but not nuclear, membranes. 

Filter the tissue suspension through several nylon filters into an ice-cold beaker. The 
first filtration is through a 250-micron membrane; the second is through an 85-micron 
membrane; the third is through a 50-micron membrane; and the fourth is through a 
20-micron membrane. Use a large funnel to hold the filters. Filtration can be sped up 
by gently squeezing the liquid through the filters. 

Centrifuge the filtrate at 1200 x g for 20 min. at 4°C to pellet the nuclei. 

Discard the dark green supernatant. The pellet will have several layers to it. One is 
starch; it is white and gritty. The nuclei are gray and soft. In the early steps, there 
may be a dark green and somewhat viscous layer of chloroplasts. 

Wash the pellets in about 25 ml cold H buffer (with Triton X-100) and resuspend by 
swirling gently and pipetting. After the pellets are resuspended, pellet the nuclei 
again at 1200 - 1300 x g. Discard the supernatant. 

Repeat the wash 3-4 times until the supernatant has changed from a dark green to a 
pale green. This usually happens after 3 or 4 resuspensions. At this point, the pellet 
is typically grayish white and very slippery. The Triton X-100 in these repeated steps 
helps to destroy the chloroplasts and mitochondria that contaminate the prep. 

Resuspend the nuclei for a final time in a total of 15 ml of H buffer and transfer the 
suspension to a sterile 125 ml Erlenmeyer flask. 

Add 15 ml, dropwise, cold 2% Sarkosyl, 0.1 M Tris, 0.04 M EDTA solution (pH 9.5) 
while swirling gently. This lyses the nuclei. The solution will become very viscous. 

8. Add 30 grams of CsCl and gently swirl at room temperature until the CsCl is in 
solution. The mixture will be gray, white and viscous. 

9. Centrifuge the solution at 11,400 x g at 4°C for at least 30 min. The longer this spin 
is, the firmer the protein pellicle. 

10. The result is typically a clear green supernatant over a white pellet, and (perhaps) 
under a protein pellicle. Carefully remove the solution under the protein pellicle and 
above the pellet. Determine the density of the solution by weighing 1 ml of solution 
and add CsCl if necessary to bring to 1.57 g/ml. The solution contains dissolved 
solids (sucrose etc) and the refractive index alone will not be an accurate guide to 
CsCl concentration. 

11. Add 20 µl of 10 mg/ml EtBr per ml of solution. 

12. Centrifuge at 184,000 x g for 16 to 20 hours in a fixed-angle rotor. 

13. Remove the dark red supernatant that is at the top of the tube with a plastic transfer 
pipette and discard. Carefully remove the DNA band with another transfer pipette. 
The DNA band is usually visible in room light; otherwise, use a long wave UV light 
to locate the band. 

14. Extract the ethidium bromide with isopropanol saturated with water and salt. Once 
the solution is clear, extract at least two more times to ensure that all of the EtBr is 
gone. Be very gentle, as it is very easy to shear the DNA at this step. This extraction 
may take a while because the DNA solution tends to be very viscous. If the solution 
is too viscous, dilute it with TE. 

15. Dialyze the DNA for at least two days against several changes (at least three times) of 
TE (10 mM Tris, 1 mM EDTA, pH 8) to remove the cesium chloride. 

16. Remove the dialyzed DNA from the tubing. If the dialyzed DNA solution contains a 
lot of debris, centrifuge the DNA solution at least at 2500 x g for 10 min. and 
carefully transfer the clear supernatant to a new tube. Read the A260 concentration of 
the DNA. 

17. Assess the quality of the DNA by agarose gel electrophoresis (1% agarose gel) of the 
DNA. Load 50 ng and 100 ng (based on the OD reading) and compare it with known 
and good quality DNA. Undigested lambda DNA and a lambda-HindIII-digested 
DNA are good molecular weight markers. 
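
Steps 16 and 17 rely on converting an A260 reading into a DNA concentration and then into a loading volume. A minimal Python sketch of that arithmetic is given here, assuming the common convention that one A260 unit of double-stranded DNA in a 1 cm path corresponds to about 50 µg/ml. The absorbance reading and dilution factor in the example are hypothetical values chosen only for illustration.

```python
# Convert an A260 reading to a dsDNA concentration, then to a loading volume.
# ASSUMPTION: 1.0 A260 unit (1 cm path length) ~ 50 ug/ml double-stranded DNA.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_conc_ug_per_ml(a260, dilution_factor=1.0):
    """Concentration of the undiluted DNA stock in ug/ml (same as ng/ul)."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def volume_for_mass_ul(mass_ng, conc_ug_per_ml):
    """Volume (ul) of stock that contains mass_ng of DNA; 1 ug/ml == 1 ng/ul."""
    return mass_ng / conc_ug_per_ml


# Hypothetical reading: A260 = 0.25 measured on a 1:20 dilution of the stock.
stock = dsdna_conc_ug_per_ml(0.25, dilution_factor=20)
for mass in (50, 100):  # the two gel loads named in step 17
    print(f"{mass} ng -> {volume_for_mass_ul(mass, stock):.2f} ul of stock")
```

When the computed loading volume is too small to pipette accurately, an intermediate dilution of the stock in TE is the usual workaround.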



Protocol for Digestion of Genomic DNA 

Protocol : 

1. The relative amounts of DNA for different crop plants that provide approximately a 
balanced number of genome equivalents are given in Table 3. Note that due to the size 
of the wheat genome, wheat DNA will be underrepresented. Lambda DNA provides 
a useful control for complete digestion. 

2. Precipitate the DNA by adding 3 volumes of 100% ethanol. Incubate at -20°C for at 
least two hours. Yeast DNA can be purchased and made up at the necessary 
concentration; therefore no precipitation is necessary for yeast DNA. 

3. Centrifuge the solution at 11,400 x g for 20 min. Decant the ethanol carefully (be 
careful not to disturb the pellet). Be sure that the residual ethanol is completely 
removed either by vacuum desiccation or by carefully wiping the sides of the tubes 
with a clean tissue. 

4. Resuspend the pellet in an appropriate volume of water. Be sure the pellet is fully 
resuspended before proceeding to the next step. This may take about 30 min. 



5. Add the appropriate volume of 10X reaction buffer provided by the manufacturer of 
the restriction enzyme to the resuspended DNA followed by the appropriate volume 
of enzymes. Be sure to mix it properly by slowly swirling the tubes. 

6. Set up the lambda digestion control for each DNA that you are digesting. 

7. Incubate both the experimental and lambda digests overnight at 37°C. Spin down 
condensation in a microfuge before proceeding. 

8. After digestion, add 2 µl of loading dye (typically 0.25% bromophenol blue, 0.25% 
xylene cyanol in 15% Ficoll or 30% glycerol) to the lambda-control digests and load 
in 1% TPE-agarose gel (TPE is 90 mM Tris-phosphate, 2 mM EDTA, pH 8). If the 
lambda DNA in the lambda control digests is completely digested, proceed with the 
precipitation of the genomic DNA in the digests. 

9. Precipitate the digested DNA by adding 3 volumes of 100% ethanol and incubating at 
-20°C for at least 2 hours (preferably overnight). 

EXCEPTION: Arabidopsis and yeast DNA are digested in an appropriate volume; 
they don't have to be precipitated. 

10. Resuspend the DNA in an appropriate volume of TE (e.g., 22 µl x 50 blots = 1100 µl) 
and an appropriate volume of 10X loading dye (e.g., 2.4 µl x 50 blots = 120 µl). Be 
careful in pipetting the loading dye - it is viscous. Be sure you are pipetting the 
correct volume. 



Table 3 

Some guide points in digesting genomic DNA. 

Species       Genome Size   Size Relative    Genome Equivalent to    Amount of DNA 
                            to Arabidopsis   2 µg Arabidopsis DNA    per blot 
Arabidopsis   120 Mb        1X               1X                      2 µg 
Brassica      1,100 Mb      9.2X             0.54X                   10 µg 
Corn          2,800 Mb      23.3X            0.43X                   20 µg 
Cotton        2,300 Mb      19.2X            0.52X                   20 µg 
Oat           11,300 Mb     94X              0.11X                   20 µg 
Rice          400 Mb        3.3X             0.75X                   5 µg 
Soybean       1,100 Mb      9.2X             0.54X                   10 µg 
Sugarbeet     758 Mb        6.3X             0.8X                    10 µg 
Sweetclover   1,100 Mb      9.2X             0.54X                   10 µg 
Wheat         16,000 Mb     133X             0.08X                   20 µg 
Yeast         15 Mb         0.12X            1X                      0.25 µg 
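
The "Size Relative to Arabidopsis" and "Genome Equivalent" columns of Table 3 can be derived from the genome size and the amount of DNA loaded per blot. The Python sketch below illustrates one way to do so, assuming that the genome equivalent is the loaded mass relative to 2 µg divided by the genome size relative to the 120 Mb Arabidopsis genome. The function names are illustrative and do not come from the protocol.

```python
# Derive the relative-size and genome-equivalent columns of Table 3.
# ASSUMPTION: genome equivalent = (ug loaded / 2 ug) / (genome Mb / 120 Mb).
ARABIDOPSIS_MB = 120.0   # reference genome size from Table 3
REFERENCE_UG = 2.0       # reference load of Arabidopsis DNA per blot

# (genome size in Mb, ug of DNA per blot) for rows with clearly legible values.
ROWS = {
    "Arabidopsis": (120, 2.0),
    "Brassica": (1100, 10.0),
    "Corn": (2800, 20.0),
    "Rice": (400, 5.0),
    "Wheat": (16000, 20.0),
    "Yeast": (15, 0.25),
}


def relative_size(genome_mb):
    """Genome size as a multiple of the Arabidopsis genome."""
    return genome_mb / ARABIDOPSIS_MB


def genome_equivalents(genome_mb, dna_ug):
    """Genome copies loaded, relative to 2 ug of Arabidopsis DNA."""
    return (dna_ug / REFERENCE_UG) / relative_size(genome_mb)


for species, (mb, ug) in ROWS.items():
    print(f"{species:12s} {relative_size(mb):7.2f}X "
          f"{genome_equivalents(mb, ug):5.2f}X")
```

The wheat row illustrates the remark in step 1 of the digestion protocol: even at 20 µg per blot, the wheat genome is represented at well under one tenth of the Arabidopsis reference.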



Protocol for Southern Blot Analysis 

The digested DNA samples are electrophoresed in 1% agarose gels in 1X TPE buffer. 
Low-voltage, overnight separations are preferred. The gels are stained with EtBr and 
photographed. 

1. For blotting the gels, first incubate the gel in 0.25 N HCl (with gentle shaking) for 
about 15 min. 

2. Then briefly rinse with water. The DNA is denatured by 2 incubations. Incubate 
(with shaking) in 0.5 M NaOH in 1.5 M NaCl for 15 min. 

3. The gel is then briefly rinsed in water and neutralized by incubating twice (with 
shaking) in 1.5 M Tris pH 7.5 in 1.5 M NaCl for 15 min. 

4. A nylon membrane is prepared by soaking it in water for at least 5 min, then in 6X 
SSC for at least 15 min. before use. (20X SSC is 175.3 g NaCl, 88.2 g sodium citrate 
per liter, adjusted to pH 7.0.) 

5. The nylon membrane is placed on top of the gel and all bubbles in between are 
removed. The DNA is blotted from the gel to the membrane using an absorbent 
medium, such as paper toweling and 6X SSC buffer. After the transfer, the membrane 
may be lightly brushed with a gloved hand to remove any agarose sticking to the 
surface. 

6. The DNA is then fixed to the membrane by UV crosslinking and baking at 80°C. The 
membrane is stored at 4°C until use. 

B. Protocol for PCR Amplification of Genomic Fragments in Arabidopsis 

Amplification procedures : 



1. Mix the following in a 0.20 ml PCR tube or 96-well PCR plate: 

Volume       Stock                                        Final Amount or Conc. 
0.5 µl       ~10 ng/µl genomic DNA¹                       5 ng 
2.5 µl       10X PCR buffer                               20 mM Tris, 50 mM KCl 
0.75 µl      50 mM MgCl2                                  1.5 mM 
1 µl         10 pmol/µl Primer 1 (Forward)                10 pmol 
1 µl         10 pmol/µl Primer 2 (Reverse)                10 pmol 
0.5 µl       5 mM dNTPs                                   0.1 mM 
0.1 µl       5 units/µl Platinum Taq™ DNA Polymerase      1 unit 
             (Life Technologies, Gaithersburg, MD) 
(to 25 µl)   Water 

¹ Arabidopsis DNA is used in the present experiment, but the procedure is a general one. 

2. The template DNA is amplified using a Perkin Elmer 9700 PCR machine: 

1) 94°C for 10 min., followed by 
2) 5 cycles:  94°C - 30 sec; 62°C - 30 sec; 72°C - 3 min 
3) 5 cycles:  94°C - 30 sec; 58°C - 30 sec; 72°C - 3 min 
4) 25 cycles: 94°C - 30 sec; 53°C - 30 sec; 72°C - 3 min 
5) 72°C for 7 min. Then the reactions are stopped by chilling to 4°C. 

The procedure can be adapted to a multi-well format if necessary. 
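
The final concentrations listed for the amplification mix follow from the dilution relation C_final = C_stock x V_stock / V_total. The Python sketch below rechecks the MgCl2, dNTP and buffer entries for the 25 µl reaction and computes the water needed to reach the final volume. The dictionary keys are illustrative labels only.

```python
# Recheck final concentrations in the 25 ul amplification reaction.
TOTAL_UL = 25.0

# Volumes (ul) of every component except water, as listed in the mix table.
VOLUMES_UL = {
    "genomic DNA": 0.5,
    "10X PCR buffer": 2.5,
    "50 mM MgCl2": 0.75,
    "primer 1": 1.0,
    "primer 2": 1.0,
    "5 mM dNTPs": 0.5,
    "Taq polymerase": 0.1,
}


def final_conc(stock_conc, volume_ul, total_ul=TOTAL_UL):
    """Concentration after diluting volume_ul of stock into total_ul."""
    return stock_conc * volume_ul / total_ul


mgcl2_mM = final_conc(50.0, VOLUMES_UL["50 mM MgCl2"])
dntp_mM = final_conc(5.0, VOLUMES_UL["5 mM dNTPs"])
buffer_x = final_conc(10.0, VOLUMES_UL["10X PCR buffer"])
water_ul = TOTAL_UL - sum(VOLUMES_UL.values())

print(f"MgCl2 {mgcl2_mM:.2f} mM, dNTPs {dntp_mM:.2f} mM, buffer {buffer_x:.1f}X")
print(f"water to add: {water_ul:.2f} ul")
```

For a multi-well run, the same volumes can be multiplied by the number of reactions plus a small excess to prepare a master mix.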
Quantification and Dilution of PCR Products: 

1. The product of the PCR is analyzed by electrophoresis in a 1% agarose gel. A 
linearized plasmid DNA can be used as a quantification standard (usually at 50, 100, 
200, and 400 ng). These will be used as references to approximate the amount of 
PCR products. HindIII-digested Lambda DNA is useful as a molecular weight 
marker. The gel can be run fairly quickly; e.g., at 100 volts. The standard gel is 
examined to determine that the size of the PCR products is consistent with the 
expected size and if there are significant extra bands or smeary products in the PCR 
reactions. 

2. The amounts of PCR products can be estimated on the basis of the plasmid standard. 

3. For the small number of reactions that produce extraneous bands, a small amount of 
DNA from bands with the correct size can be isolated by dipping a sterile 10-µl tip 
into the band while viewing through a UV Transilluminator. The small amount of 
agarose gel (with the DNA fragment) is used in the labeling reaction. 

C. Protocol for PCR-DIG-Labeling of DNA 

Solutions : 

Reagents in PCR reactions (diluted PCR products, 10X PCR Buffer, 50 mM MgCl2, 5 
U/µl Platinum Taq Polymerase, and the primers) 

10X dNTP + DIG-11-dUTP [1:5]: (2 mM dATP, 2 mM dCTP, 2 mM dGTP, 1.65 
mM dTTP, 0.35 mM DIG-11-dUTP) 

10X dNTP + DIG-11-dUTP [1:10]: (2 mM dATP, 2 mM dCTP, 2 mM dGTP, 1.81 
mM dTTP, 0.19 mM DIG-11-dUTP) 

10X dNTP + DIG-11-dUTP [1:15]: (2 mM dATP, 2 mM dCTP, 2 mM dGTP, 1.875 
mM dTTP, 0.125 mM DIG-11-dUTP) 

TE buffer (10 mM Tris, 1 mM EDTA, pH 8) 

Maleate buffer: In 700 ml of deionized distilled water, dissolve 11.61 g maleic acid 
and 8.77 g NaCl. Add NaOH to adjust the pH to 7.5. Bring the volume to 1 L. Stir 
for 15 min. and sterilize. 

10% blocking solution: In 80 ml deionized distilled water, dissolve 1.16 g maleic 
acid. Next, add NaOH to adjust the pH to 7.5. Add 10 g of the blocking reagent 
powder (Boehringer Mannheim, Indianapolis, IN, Cat. no. 1096176). Heat to 60°C 
while stirring to dissolve the powder. Adjust the volume to 100 ml with water. Stir 
and sterilize. 

1% blocking solution: Dilute the 10% stock to 1% using the maleate buffer. 

Buffer 3 (100 mM Tris, 100 mM NaCl, 50 mM MgCl2, pH 9.5). Prepared from 
autoclaved solutions of 1 M Tris pH 9.5, 5 M NaCl, and 1 M MgCl2 in autoclaved 
distilled water. 

Procedure : 

1. PCR reactions are performed in 25 µl volumes containing: 

PCR buffer                   1X 
MgCl2                        1.5 mM 
10X dNTP + DIG-11-dUTP       1X (please see the note below) 
Platinum Taq™ Polymerase     1 unit 
10 pg probe DNA 
10 pmol primer 1 

Note: 

                                   Use for: 
10X dNTP + DIG-11-dUTP (1:5)       < 1 kb 
10X dNTP + DIG-11-dUTP (1:10)      1 kb to 1.8 kb 
10X dNTP + DIG-11-dUTP (1:15)      > 1.8 kb 

2. The PCR reaction uses the following amplification cycles: 

1) 94°C for 10 min. 
2) 5 cycles:  95°C - 30 sec; 61°C - 1 min; 73°C - 5 min 
3) 5 cycles:  95°C - 30 sec; 59°C - 1 min; 75°C - 5 min 
4) 25 cycles: 95°C - 30 sec; 51°C - 1 min; 73°C - 5 min 
5) 72°C for 8 min. The reactions are terminated by chilling to 4°C (hold). 

3. The products are analyzed by electrophoresis in a 1% agarose gel, comparing to an 
aliquot of the unlabelled probe starting material. 



The amount of DIG-labeled probe is determined as follows: 

Make serial dilutions of the diluted control DNA in dilution buffer (TE: 10 mM Tris 
and 1 mM EDTA, pH 8) as shown in the following table: 

DIG-labeled control    Stepwise Dilution     Final Conc. 
DNA starting conc.                           (Dilution Name) 
5 ng/µl                1 µl in 49 µl TE      100 pg/µl (A) 
100 pg/µl (A)          25 µl in 25 µl TE     50 pg/µl (B) 
50 pg/µl (B)           25 µl in 25 µl TE     25 pg/µl (C) 
25 pg/µl (C)           20 µl in 30 µl TE     10 pg/µl (D) 

a. Serial dilutions of a DIG-labeled standard DNA ranging from 100 pg to 10 pg 
are spotted onto a positively charged nylon membrane, marking the membrane 
lightly with a pencil to identify each dilution. 

b. Serial dilutions (e.g., 1:50, 1:2500, 1:10,000) of the newly labeled DNA probe 
are spotted. 

c. The membrane is fixed by UV crosslinking. 

d. The membrane is wetted with a small amount of maleate buffer and then 
incubated in 1% blocking solution for 15 min at room temp. 

e. The labeled DNA is then detected using alkaline phosphatase conjugated anti- 
DIG antibody (Boehringer Mannheim, Indianapolis, IN, cat. no. 1093274) and 
an NBT substrate according to the manufacturer's instructions. 

f. Spot intensities of the control and experimental dilutions are then compared to 
estimate the concentration of the PCR-DIG-labeled probe. 

D. Prehybridization and Hybridization of Southern Blots 

Solutions : 

100% Formamide purchased from Gibco 

20X SSC (1X = 0.15 M NaCl, 0.015 M Na3citrate) 
per L: 175 g NaCl 
       87.5 g Na3citrate·2H2O 

20% Sarkosyl (N-lauroyl-sarcosine) 

20% SDS (sodium dodecyl sulphate) 

10% Blocking Reagent: In 80 ml deionized distilled water, dissolve 1.16 g maleic 
acid. Next, add NaOH to adjust the pH to 7.5. Add 10 g of the blocking reagent 
powder. Heat to 60°C while stirring to dissolve the powder. Adjust the volume 
to 100 ml with water. Stir and sterilize. 



Prehybridization Mix: 

Final            Components          Volume          Stock 
Concentration                        (per 100 ml) 
50%              Formamide           50 ml           100% 
5X               SSC                 25 ml           20X 
0.1%             Sarkosyl            0.5 ml          20% 
0.02%            SDS                 0.1 ml          20% 
2%               Blocking Reagent    20 ml           10% 
                 Water               4.4 ml 
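
The volumes in the prehybridization mix can be derived from the stock and final concentrations, with water making up the remainder. The Python sketch below shows this for a 100 ml batch; the batch size is a parameter, so the same function can be used for other volumes. The names used are illustrative.

```python
# Derive stock volumes for the prehybridization mix from final/stock ratios.
# Each entry: component -> (final concentration, stock concentration),
# both in the same unit (%, or X for SSC), as listed in the mix table.
COMPONENTS = {
    "Formamide": (50.0, 100.0),
    "SSC": (5.0, 20.0),
    "Sarkosyl": (0.1, 20.0),
    "SDS": (0.02, 20.0),
    "Blocking Reagent": (2.0, 10.0),
}


def mix_volumes(total_ml=100.0):
    """Return ml of each stock, plus the water needed to reach total_ml."""
    volumes = {
        name: total_ml * final / stock
        for name, (final, stock) in COMPONENTS.items()
    }
    volumes["Water"] = total_ml - sum(volumes.values())
    return volumes


for name, ml in mix_volumes(100.0).items():
    print(f"{name:18s} {ml:6.2f} ml")
```

For example, `mix_volumes(30.0)` scales the recipe to the 30 ml used per 100 cm² of membrane in the general procedure.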





General Procedures : 

1. Place the blot in a heat-sealable plastic bag and add an appropriate volume of 
prehybridization solution (30 ml/100 cm²) at room temperature. Seal the bag with a 
heat sealer, avoiding bubbles as much as possible. Lay down the bags in a large 
plastic tray (one tray can accommodate at least 4-5 bags). Ensure that the bags are 
lying flat in the tray so that the prehybridization solution is evenly distributed 
throughout the bag. Incubate the blot for at least 2 hours with gentle agitation using a 
waver shaker. 

2. Denature DIG-labeled DNA probe by incubating for 10 min. at 98°C using the PCR 
machine and immediately cool it to 4°C. 

3. Add probe to prehybridization solution (25 ng/ml; 30 ml = 750 ng total probe) and 
mix well but avoid foaming. Bubbles may lead to background. 

4. Pour off the prehybridization solution from the hybridization bags and add new 
prehybridization and probe solution mixture to the bags containing the membrane. 

5. Incubate with gentle agitation for at least 16 hours. 

6. Proceed to medium stringency post-hybridization wash: 

Three times for 20 min. each with gentle agitation using 1X SSC, 1% SDS at 60°C. 

All wash solutions must be prewarmed to 60° C. Use about 100 ml of wash solution 
per membrane. 

To avoid background keep the membranes fully submerged to avoid drying in spots; 
agitate sufficiently to avoid having membranes stick to one another. 

7. After the wash, proceed to immunological detection and CSPD development. 

E. Procedure for Immunological Detection with CSPD 

Solutions : 

Buffer 1: Maleic acid buffer (0.1 M maleic acid, 0.15 M NaCl; 
adjusted to pH 7.5 with NaOH) 

Washing buffer: Maleic acid buffer with 0.3% (v/v) Tween 20. 

Blocking stock solution (10X concentration): 10% blocking reagent in Buffer 1. 
Dissolve blocking reagent powder (Boehringer Mannheim, Indianapolis, IN, cat. no. 
1096176) by constantly stirring on a 65°C heating block or heat in a microwave, 
autoclave and store at 4°C. 

Buffer 2 (1X blocking solution): Dilute the stock solution 1:10 in Buffer 1. 

Detection buffer: 0.1 M Tris, 0.1 M NaCl, pH 9.5 

Procedure : 

1. After the post-hybridization wash the blots are briefly rinsed (1-5 min.) in the maleate 
washing buffer with gentle shaking. 

2. Then the membranes are incubated for 30 min. in Buffer 2 with gentle shaking. 

3. Anti-DIG-AP conjugate (Boehringer Mannheim, Indianapolis, IN, cat. no. 1093274) 
at 75 mU/ml (1:10,000) in Buffer 2 is used for detection. 75 ml of solution can be 
used for 3 blots. 

4. The membrane is incubated for 30 min. in the antibody solution with gentle shaking. 

5. The membranes are washed twice in washing buffer with gentle shaking. About 250 
ml is used per wash for 3 blots. 

6. The blots are equilibrated for 2-5 min in 60 ml detection buffer. 

7. Dilute CSPD (1:200) in detection buffer. (This can be prepared ahead of time and 
stored in the dark at 4°C). 

The following steps must be done individually. Bags (one for detection and one for 
exposure) are generally cut and ready before doing the following steps. 

8. The blot is carefully removed from the detection buffer and excess liquid removed 
without drying the membrane. The blot is immediately placed in a bag and 1.5 ml of 
CSPD solution is added. The CSPD solution can be spread over the membrane. 
Bubbles present at the edge and on the surface of the blot are typically removed by 
gentle rubbing. The membrane is incubated for 5 min. in CSPD solution. 

9. Excess liquid is removed and the membrane is blotted briefly (DNA side up) on 
Whatman 3MM paper. Do not let the membrane dry completely. 

10. Seal the damp membrane in a hybridization bag and incubate for 10 min at 37°C to 
enhance the luminescent reaction. 

11. Expose for 2 hours at room temperature to X-ray film. Multiple exposures can be 
taken. Luminescence continues for at least 24 hours and signal intensity increases 
during the first hours. 

Example 3: Transformation of Carrot Cells 

Transformation of plant cells can be accomplished by a number of methods, as 
described above. Similarly, a number of plant genera can be regenerated from tissue culture 
following transformation. Transformation and regeneration of carrot cells as described herein 
is illustrative. 

Single cell suspension cultures of carrot (Daucus carota) cells are established from 
hypocotyls of cultivar Early Nantes in B5 growth medium (O.L. Gamborg et al., Plant 
Physiol. 45:372 (1970)) plus 2,4-D and 15 mM CaCl2 (B5-44 medium) by methods known in 
the art. The suspension cultures are subcultured by adding 10 ml of the suspension culture to 
40 ml of B5-44 medium in 250 ml flasks every 7 days and are maintained in a shaker at 150 
rpm at 27°C in the dark. 

The suspension culture cells are transformed with exogenous DNA as described by Z. 
Chen et al., Plant Mol. Bio. 36:163 (1998). Briefly, 4-days post-subculture cells are incubated 
with cell wall digestion solution containing 0.4 M sorbitol, 2% driselase, 5 mM MES 
(2-[N-Morpholino]ethanesulfonic acid) pH 5.0 for 5 hours. The digested cells are pelleted gently 
at 60 x g for 5 min. and washed twice in W5 solution containing 154 mM NaCl, 5 mM KCl, 
125 mM CaCl2 and 5 mM glucose, pH 6.0. The protoplasts are suspended in MC solution 
containing 5 mM MES, 20 mM CaCl2, 0.5 M mannitol, pH 5.7 and the protoplast density is 
5 adjusted to about 4 x lO*' protoplasts per ml. 

15-60 µg of plasmid DNA is mixed with 0.9 ml of protoplasts. The resulting 
suspension is mixed with 40% polyethylene glycol (MW 8000, PEG 8000), by gentle 
inversion a few times at room temperature for 5 to 25 min. Protoplast culture medium known 
in the art is added into the PEG-DNA-protoplast mixture. Protoplasts are incubated in the 
culture medium for 24 hours to 5 days and cell extracts can be used for assay of transient 
expression of the introduced gene. Alternatively, transformed cells can be used to produce 
transgenic callus, which in turn can be used to produce transgenic plants, by methods known 
in the art. See, for example, Nomura and Komamine, Plant Physiol. 79:988-991 (1985), 
Identification and Isolation of Single Cells that Produce Somatic Embryos in Carrot 
Suspension Cultures. 

An additional deposit, PTA-1411, of an E. coli Library, E. co/iLibA021800, was 
made at the American Type Culture Collection in Manassas, Virginia, USA on February 22, 
2000 to meet the requirements of Budapest Treaty for the international recognition of the 
deposit of microorganisms. This deposit was assigned ATCC accession no. PTA-1411. 

The invention being thus described, it will be apparent to one of ordinary skill in the
art that various modifications of the materials and methods for practicing the invention can be
made. Such modifications are to be considered within the scope of the invention as defined
by the following claims.

Each of the references from the patent and periodical literature cited herein is hereby
expressly incorporated in its entirety by such citation.
TABLE 1

>1297184 

len = 

Term 
Intr 
Init 



10090 9717 
10506 10184 
11137 10900 



>1297184 
len = 



1470 nex =



Term 
Intr 
Intr 
Init 

>1297184 

len = 

Init 
Intr 
Term 



14341 13880 

14529 14477 

14673 14624 

15349 15056 

/40037 



16472 16883 
17095 17382 
17730 18206 



>1297184 
len = 



nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



23715 
24275 
24477 
24641 
24949 
25275 
25618 
25852 
26008 
26239 
26416 



23788 
24361 
24554 
24834 
25090 
25355 
25746 
25929 
26079 
26319 
26618 



>1402874 
len = 



1171 nex 



Init 
Intr 
Intr 
Term 

>1402874 

len = 



65717 66071 

66169 66290 

66363 66515 

66600 66887 

/16813 



60 



Init 
Intr 
Term 



78870 78982 
79066 79242 
79311 79622 
>1402874
len = 



Init 
Intr 
Term 

>1532162 

len = 



80005 80250 
80334 80486 
80573 80854 



1353 



Init 
Intr 
Intr 
Term 



10117 10395 

10519 10718 

10809 11038 

11112 11469 



>1532162 
len = 



/156172 



1286 nex 



Init 
Intr 
Intr 
Term 



10232 10395 

10519 10718 

10809 11038 

11112 11517 



>1532162 
len = 



/1415 
649 nex =



Init 
Term 



>1532162 
len = 



11955 12231 
12334 12603 



732937 
738 nex 



Init 
Term 



11982 12231 
12334 12719 



>1532162 
len = 



2230 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



21905 
22473 
22691 
22971 
23189 
23415 
23568 
23758 
23900 



22035 
22603 
22865 
23036 
23323 
23494 
23670 
23816 
24131 



>1532162 /20800
len = 2174 nex =
Init 22161 22221
Intr 22473 22603
Intr 22691 22865
Intr 22971 23036
Intr 23189 23323
Intr 23415 23494
Intr 23568 23670
Intr 23758 23816
Term 23900 24131

>1532162 /33957
len = 2175 nex =
Init 22161 22221
Intr 22473 22603
Intr 22691 22865
Intr 22971 23036
Intr 23189 23323
Intr 23415 23494
Intr 23568 23670
Intr 23758 23816
Term 23900 24132

>1532162 /154048
len = 638 nex =
Sngl 28309 28946

>1532162 /15529
len = 2414 nex =
Sngl 40592 41742

>1532162 /39051
len = 2127 nex =
Sngl 45263 44433

>1532162 /15968
len = 191 nex =
Sngl 47765 47575

>1532162 /29991
len = 2374 nex =
Sngl 48703 47562

>1532162 /6135
len = 2370 nex =
Sngl 48703 47566



>1532162 

60 



/26043 
Term
Intr 
Init 



50762 50243 
51173 51108 
51483 51286 



>1532162 
len = 



/14942 
1966 nex 



Init 
Intr 
Term 



52773 52831 
52920 53095 
53208 53601 



>1532162 
len = 



Init 
Intr 
Term 



52773 52831 
52920 53095 
53208 53533 



>1532162 

len = 

Term 
Intr 
Intr 
Init 



1675 nex 



58063 
58192 
58514 
59394 



57720 
58151 
58430 
58958 



>1707006 
len = 



/26007 
359 nex =



Init 
Term 



22636 22811 
22887 22994 



>1707006 
len = 



2124 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>1707006 

len = 



22665 22811 

22887 22993 

23509 23571 

23932 23963 

24280 24374 

24494 24773 



/40748 



3502 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



50053 
50378 
51366 
51531 
51655 
51858 
52003 



50271 
50467 
51425 
51570 
51748 
51917 
52094 
>1707006 /125567
len = 1097 nex =
Term 54221 53867
Init 54963 54743

>1707006 /152227
len = 430 nex =
Sngl 55477 55054

>1707006 /38063
len = 1598 nex =
Term 54226 53885
Init 55482 54743

>1707006 /10375
len = 744 nex =
Init 58320 58692
Term 58784 59063

>1707006 /10617
len = 815 nex =
Term 62856 62577
Intr 63141 62943
Init 63391 63248

>1707006 /1711
len = 801 nex =
Term 62856 62591
Intr 63141 62943
Init 63391 63248

>1707006 /29818
len = 760 nex =
Term 62856 62632
Intr 63141 62943
Init 63391 63248

>1707006 /40627
len = 2499 nex =
Init 67998 68163
Intr 68353 68506
Intr 68592 68875
Intr 68968 69064
Intr 69314 69419
Intr 69514 69596
Intr 69689 69834
Intr 69966 70071
Intr 70216 70275
Term 70361 70496



>1707006 
len = 
Sngl 
>1785673 
len = 
Sngl 
>1871173 
len = 



/101081 
976 nex = 
75101 74126 
723693 
622 nex = 
31275 31896 
/38610 
2054 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>1871173 

len = 



12146 
12284 
12573 
12790 
12970 
13186 
13424 
13584 



12237 
12483 
12704 
12866 
13105 
13326 
13482 
13650 



Init 



49483 49183 
50220 49576 



>1871173 
len = 



3870 



nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



50368 
50630 
51247 
51370 
51546 
52100 
54023 



50540 
50680 
51298 
51428 
51591 
52233 
54234 



/96448 
822 nex 



Init 
Intr 
Intr 
Term 



105150 105228 

105315 105428 

105500 105535 

105630 105953 
>1877523 /2677
len = 670 nex =
Sngl 21255 20592

>1877523 /1693
len = 710 nex =
Term 21139 20646
Init 21355 21247

>1877523 /40042
len = 2369 nex =
Term 38060 37972
Intr 38794 38656
Intr 38997 38927
Intr 39148 39104
Intr 39328 39239
Intr 39526 39438
Intr 39689 39614
Init 40034 39798

>1877523 /35733
len = 1245 nex =
Term 53914 53583
Intr 54335 54159
Init 54827 54574

>1877523 /34291
len = 809 nex =
Sngl 61137 61281

>1877523 /2979
len = 766 nex =
Init 61146 61281
Term 61565 61911

>1931636 /93598
len = 1589 nex =
Init 111855 112746
Intr 112845 112949
Term 113156 113443

>1931536 /40765
len = 1821 nex =
Term 50015 49475
Intr
Init 

>1931636 

len = 

Sngl 

>1931636 

len =

Sngl 

>1946354 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>1946354 

len = 

Init 
Term 

>1946354 

len = 

Init 
Term 

>1946354 

len = 

Sngl 

>1946354 

len = 

Sngl 

>1946354 

len = 



50253 50130 
51295 50557 

/20637 

644 nex = 

63596 62953 

/14648 

503 nex = 

97733 97231 

/1391 

4584 nex = 



12119 
12281 
12535 
12756 
13005 
13304 
13613 
13994 
14593 
15009 
15456 



11739 
12213 
12455 
12682 
12873 
13257 
13401 
13833 
14363 
14680 
15157 



/7619 

939 nex = 

31875 32384 
32537 32813 

734999 

1078 nex = 

33182 33416 
33743 34259 

/39560 

674 nex =

41592 42265 

/41046 

730 nex =

57609 58323 

71820 

1190 nex = 
Term
Init

>1946354 

len = 

Init 
Term 

>2062153 

len = 

Term 
Intr 
Init 

>2062153 
len =
Sngl 

>2062153 

len = 

Term 
Intr 
Init 

>2062153 

len = 

Term 
Intr 
Init 

>2062153 

len = 

Term 
Init 

>2062153 

len = 

Term 
Intr 
Init 

>2062153 

len = 



7729 6909 
8098 7816 

/22671 

1583 nex = 

83167 83385 
83523 83614 

/38051 

1491 nex =

15272 14834 
15841 15386 
16324 16026 

/119458 

1513 nex = 

16220 16026 

/157474 

1497 nex = 

15272 14858 
15841 15386 
16220 16026 

/30056 

1520 nex = 

15272 14836 
15841 15386 
16220 16026 

/42777 

1450 nex = 

24390 23947 
25283 24512 

/6448 

1481 nex = 

24390 23947 
24955 24512 
25427 25053 

/12715 

1976 nex = 
Term
Intr 
Init 

>2062153 

len = 

Init 
Intr 
Term 

>2062153 
len = 
Sngl 

>2062153 

len = 

Init 
Intr 
Term 

>2088638 
len = 
Sngl 

>2088638 

len = 

Term 
Init 



55961 55118 
55262 56051 
57093 56580 

/30003 

2057 nex = 

7382 7843 
7929 8378 
8469 8866 

732293 

790 nex = 

69530 68750 

/29750 

2112 nex = 

76786 77284 
77663 77774 
77921 78394 

79398 

616 nex = 

103573 102958 

76732 

1632 nex = 

17390 16530 
18161 17822 



>2088638 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2088638 

len = 

Term 
Intr 
Init 



739048 
2533 nex 



24452 
25154 
25457 
25633 
25917 
26186 
26486 



24782 
25378 
25551 
25822 
26041 
26401 
26984 



733701 

1515 nex = 

32027 31685 
32312 32109 
32802 32388 



60 >2088638 



715207 



Reference No. 2750-942P 



len =



2110 nex 



Term 
Intr 
Intr 
Init 

>2088638 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2088638 
len = 
Sngl 

>2088638 

len = 

Init 
Intr 
Term 

>2088638 

len = 

Init 
Intr 
Term 

>2098816 

len = 

Sngl 

>2098816 

len = 

Sngl 

>2098816 



50426 50181 

50656 50514 

51540 51487 

52290 52051 

/5504 

2820 



52859 
53066 
53260 
53424 
53674 
53905 
54431 
54618 
54880 
55058 



nex = 

52504 
52943 
53159 
53356 
53567 
53851 
54301 
54544 
54803 
54973 



/35056 

1510 nex = 

70686 69178 

/32440 

919 nex = 

80756 80853 
81026 81170 
81258 81674 

/5046 

1647 nex =

95145 95511 
95860 96013 
96327 96791 

/31252 

704 nex = 

35121 35824 

715292 

1279 nex = 

39507 40333 

/36730
len = 2135 nex = 9
Init 43827 44181 + 0
Intr 44267 44314 + 0
Intr 44406 44582 + 0
Intr 44668 44818 + 0
Intr 44908 44994 + 0
Intr 45079 45203 + 0
Intr 45282 45400 + 0
Intr 45483 45594 + 0
Term 45685 45961 + 0

>2098816 /8716
len = 1090 nex = 5
Init 44941 44994 + 0
Intr 45079 45203 + 0
Intr 45282 45400 + 0
Intr 45483 45594 + 0
Term 45685 46007 + 0

>2098816 /36216
len = 2338 nex = 6
Init 58990 59535 + 0
Intr 59663 59944 + 0
Intr 60031 60178 + 0
Intr 60282 60367 + 0
Intr 60894 60971 + 0
Term 61070 61327 + 0

>2098816 /42713
len = 2280 nex = 6
Init 59046 59535 + 0
Intr 59663 59944 + 0
Intr 60031 60178 + 0
Intr 60282 60367 + 0
Intr 60894 60971 + 0
Term 61070 61325 + 0

>2098816 /36286
len = 643 nex = 1
Sngl 6052 5410 - 0

>2098816 /38820
len = 756 nex = 1
Sngl 6188 5433 - 0

>2098816 /38170
len = 2445 nex = 6
Term 63428 62916 - 0
Intr 63750 63522 - 0
Intr 63933 63894 - 0
Intr 64507 64381 - 0
Intr 64935 64803 - 0
Init 65360 65060 - 0



>2098816 
len = 
Sngl 

>2098815 

len = 

Term 
Init 

>2098816 

len = 

Sngl 

>2098816 

len = 

Sngl 

>2098816 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>2098816 

len = 

Term 
Intr 
Init 

>2104523 

len = 

Term 
Init 

>2104523

len = 



/40254 

628 nex = 

69008 68744 

/17126 

811 nex = 

69008 68666 
69476 69328 

/122497 

359 nex = 

70110 69752 

736543 

1173 nex = 

77771 76602 

/17357 

2350 nex = 



88159 
88942 
89118 
89580 
90016 
90323 



88663 
89027 
89341 
89646 
90126 
90506 



/31770 

1183 nex = 

90853 90569 
91066 90933 
91751 91546 

/21952 

2710 nex = 

71964 70632 
73339 72021 

734676 

3970 nex = 
Term
Intr 
Intr 
5 Intr 
Intr 
Intr 
Intr 
Intr 

10 Intr 
Intr 
Intr 
Init 

15 >2160132 



Term 
Intr 
Intr 
Init 

>2160132 

len = 

Sngl 

>2160132 

len = 

Sngl 

>2160132 

len = 

Sngl 

>2160132 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



76257 
76489 
77518 
77737 
77956 
78109 
78360 
78636 
78894 
79089 
79279 
79954 



75990 
76350 
77169 
77610 
77828 
78031 
78197 
78451 
78763 
78998 
79180 
79652 



/9002 

1994 nex = 

38308 37073 

38529 38400 

38751 38614 

39066 38884 

/18804 

1584 nex = 

60820 59237 

/21783 

415 nex = 

78298 78712 

/21416 

698 nex = 

79001 78304 

/15957 



2326 

88656 
88915 
89076 
89255 
89496 
89765 
89957 
90142 
90343 
90814 



nex == 

88489 
88769 
89020 
89214 
89416 
89687 
89886 
90042 
90314 
90417 



len = 

60 



2353 nex = 



10 
Term
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



88656 
88915 
89076 
89255 
89496 
89765 
89937 
90142 
90343 
90839 



88487 
88759 
89020 
89214 
89416 
89687 
89886 
90042 
90314 
90417 



3312 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



15073 
15288 
15853 
16067 
16314 
17071 
17685 



14667 
15145 
15737 
15930 
16190 
16986 
17547 



>2160155 
len = 



Init 
Term 



44387 45479 
45550 46030 



>2160155 
len = 



2477 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



52597 
52839 
53020 
53302 
54097 
54467 
54755 



52279 
52684 
52925 
53110 
53469 
54189 
54540 



>2160155 
len = 



3299 



nex : 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



60441 
61340 
61506 
61883 
62027 
62237 
62639 
62828 
63016 
63191 
63481 



60770 
61399 
61619 
61948 
62134 
62320 
62740 
62941 
63096 
63310 
63739 



>2160155 

60 



/17081 
1390 nex =



Term 
Init 

>2160155 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>2160155 

len = 

Init 
Term 

>2160155 

len = 

Init 
Term 

>2160155 

len = 

Init 
Intr 
Term 

>2160155 

len = 

Term 
Init 

>2182285 

len = 

Init 
Intr 
Intr 
Term 

>2182285 

len = 



5724 5111 
6498 6293 

/39525 

2602 nex = 

7274 7423 

7512 7572 

7673 7725 

7845 7946 

8057 8546 

8659 9410 

76642 

823 nex = 

76276 76526 
76743 77098 

78575 

851 nex = 

76277 76526 
76743 77127 

734772 

1361 nex = 

7845 7946 
8057 8546 
8659 9205 

723319 

798 nex = 

86139 86067 
86864 86267 

721725 

2410 nex = 

10500 10780 

11596 11657 

12371 12411 

12536 12907 

72118 

508 nex = 



Init 33841 33970 
60 Term 34088 34262 
>2182285 /25136
len = 674 nex =
Init 36937 37065
Term 37294 37610

>2182285 /108302
len = 610 nex =
Init 38504 38637
Term 38787 39106

>2182285 /1264
len = 819 nex =
Sngl 40950 40138

>2182285 /27763
len = 2303 nex =
Term 51059 50550
Intr 51488 51406
Intr 51733 51567
Intr 52004 51818
Init 52852 52510

>2182285 /13186
len = 2050 nex =

>2182285 /27609
len = 957 nex =
Term 97439 97181
Intr 97605 97537
Intr 97796 97693
Init 98125 97940

>2182286 /37761
len = 1581 nex =
Term 12027 10760
Init 12340 12113

>2182286 /34835
len = 2290 nex =
Term 20929 20543
Intr 21150 21014
Intr 21441 21240
Intr 21635 21537
Intr 22248 22162
Init 22824 22577

>2182286 /15161
len = 336 nex =
Sngl 32955 32620

>2182286 /3538
len = 1090 nex =
Term 56957 56571
Intr 57090 57034
Intr 57291 57192
Intr 57436 57378
Init 57657 57525

>2182286 /31705
len = 1400 nex =
Sngl 61850 60451

>2182287 /13008
len = 2683 nex =
Init 15455 15581
Intr 15687 15748
Intr 15834 15911
Intr 15991 16066
Intr 16164 16234
Intr 16347 16448
Intr 16539 16628
Intr 16756 16830
Intr 17067 17099
Intr 17201 17272
Intr 17463 17525
Intr 17658 17773
Term 17832 18137

>2182287 /14016
len = 1166 nex =
Sngl 33340 33469

>2182287 /35042
len = 497 nex =
Sngl 33706 33927

>2182287 /20858
len = 1183 nex =
Sngl 50135 49403 0
>2182287 /2985
len = 1215 nex =
Init 66989 67367
Term 67700 68203

>2182287 /20754
len = 1074 nex =
Init 67134 67367
Term 67700 68203

>2182287 /13385
len = 3085 nex =
Init 70516 70728
Intr 71073 71225
Intr 71312 71410
Intr 71569 71670
Intr 71750 71893
Intr 72009 72103
Intr 72210 72339
Intr 72415 72519
Intr 72624 72730
Intr 72806 72857
Intr 72941 73030
Intr 73120 73204
Intr 73290 73354
Term 73441 73600

>2182287 /121535
len = 153 nex =
Sngl 88279 88127

>2182287 /103034
len = 910 nex =
Term 97140 96927
Intr 97606 97230
Init 97824 97721

>2182289 /27197
len = 2831 nex =
Term 14016 13844
Intr 14182 14117
Intr 14330 14282
Intr 14588 14621
Intr 14820 14775
Intr 15017 14924
Intr 15178 15116
Intr 15465 15343
Intr 15654 15621
Intr 16127 16011 - 0
Init 16674 16500 - 0

>2182289 /205500
len = 1030 nex = 1
Sngl 52945 51919 - 0

>2182289 /31372
len = 1810 nex = 7
Term 58360 58263 - 0
Intr 58757 58601 - 0
Intr 59020 58889 - 0
Intr 59232 59120 - 0
Intr 59487 59328 - 0
Intr 59636 59613 - 0
Init 60069 59921 - 0

>2182289 /38858
len = 1690 nex = 3
Term 85015 84307 - 0
Intr 85533 85226 - 0
Init 85988 85678 - 0

>2191125 /28640
len = 1810 nex = 3
Init 105687 105961 + 0
Intr 106179 106465 + 0
Term 106570 107490 + 0

>2191126 /1204
len = 1690 nex = 1
Sngl 110532 112213 + 0

>2191126 /41187
len = 2551 nex = 7
Term 115853 115628 - 0
Intr 116345 116037 - 0
Intr 116498 116418 - 0
Intr 116825 116751 - 0
Intr 116990 116904 - 0
Intr 117192 117090 - 0
Init 118178 117485 - 0

>2191126 /21195
len = 2566 nex = 7
Term 115853 115637 - 0
Intr 116345 116037 - 0
Intr 116498 116418 - 0
Intr 116825 116751 - 0
Intr 116990 116904 - 0
Intr 117192 117090 - 0
Init 118202 117485 - 0

>2191126 /19141
len = nex = 14
Term 25095 24742 - 0
Intr 25245 25180 - 0
Intr 25409 25338 - 0
Intr 25625 25512 - 0
Intr 25812 25720 - 0
Intr 25961 25899 - 0
Intr 26152 26042 - 0
Intr 26360 26247 - 0
Intr 26604 26506 - 0
Intr 26756 26691 - 0
Intr 26948 26853 - 0
Intr 27119 27035 - 0
Intr 27350 27203 - 0
Init 28179 28046 - 0

>2191126 /22919
len = 1497 nex = 4
Init 28448 28746 + 0
Intr 29035 29235 + 0
Intr 29321 29463 + 0
Term 29617 29944 + 0

>2191126 /117191
len = 253 nex = 1
Sngl 66070 66322 + 0

>2191126 /7653
len = 1991 nex = 6
Term 5264 4896 - 0
Intr 5520 5337 - 0
Intr 5933 5601 - 0
Intr 6260 6123 - 0
Intr 6458 6329 - 0
Init 6886 6590 - 0

>2191126 /41270
len = 1008 nex = 4
Term 79378 79212 - 0
Intr 79532 79461 - 0
Intr 79730 79623 - 0
Init 79919 79819 - 0
>2191126
len = 1131 nex =
Term 79378 79190
Intr 79532 79461
Intr 79730 79623
Intr 79919 79819
Init 80320 80142

>2191126
len = 1177 nex =
Term 79378 79212
Intr 79532 79461
Intr 79730 79623
Intr 79919 79819
Init 80388 80142

len = 2830 nex =
Init 90009 90076
Intr 90175 90222
Intr 90315 90410
Intr 90483 90563
Intr 90659 90700
Intr 90784 90947
Intr 91030 91078
Intr 91166 91214
Intr 91306 91445
Intr 91538 91615
Intr 91709 91834
Term 91909 92323

>2191157 /5457
len = 688 nex =
Term 110545 110202
Init 110889 110723

>2191157 /39714
len = 520 nex =
Sngl 24526 25045

>2191157 /37336
len = 1558 nex =
Init 26629 26769
Term 27064 27170
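
For readers who wish to process the entries of Table 1 computationally, a minimal sketch follows. It rests on assumptions inferred from the layout of the table rather than on definitions given in this section: that a line beginning with ">" carries a sequence identifier followed by a "/"-prefixed prediction identifier, that "nex" is the number of exon rows, that Init, Intr, Term and Sngl label initial, internal, terminal and single exons, and that an exon whose first coordinate exceeds its second lies on the minus strand. The class and function names are hypothetical. The two sample records are copied from the table.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

EXON_LABELS = {"Init", "Intr", "Term", "Sngl"}


@dataclass
class Exon:
    kind: str
    start: int
    end: int

    @property
    def strand(self) -> str:
        # Assumption: descending coordinates indicate the minus strand.
        return "+" if self.start <= self.end else "-"

    @property
    def span(self) -> int:
        return abs(self.end - self.start) + 1


@dataclass
class Record:
    seq_id: str
    pred_id: Optional[str]
    length: Optional[int] = None
    nex: Optional[int] = None
    exons: List[Exon] = field(default_factory=list)


def parse_records(text: str) -> List[Record]:
    """Parse '>id /id', 'len = N nex = M' and exon rows into Record objects."""
    records: List[Record] = []
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0].startswith(">"):
            pred = tokens[1].lstrip("/") if len(tokens) > 1 else None
            records.append(Record(tokens[0][1:], pred))
        elif tokens[0] == "len" and records:
            m = re.search(r"len\s*=\s*(\d+)", raw)
            if m:
                records[-1].length = int(m.group(1))
            m = re.search(r"nex\s*=\s*(\d+)", raw)
            if m:  # the value is often missing in the extracted text
                records[-1].nex = int(m.group(1))
        elif tokens[0] in EXON_LABELS and len(tokens) >= 3 and records:
            records[-1].exons.append(Exon(tokens[0], int(tokens[1]), int(tokens[2])))
    return records


SAMPLE = """\
>2191157 /39714
len = 520 nex =
Sngl 24526 25045

>2182287 /103034
len = 910 nex =
Term 97140 96927
Intr 97606 97230
Init 97824 97721
"""

if __name__ == "__main__":
    for rec in parse_records(SAMPLE):
        strands = {e.strand for e in rec.exons}
        print(rec.seq_id, rec.pred_id, rec.length, len(rec.exons), strands)
```

Where a "nex" value is present, comparing it with the number of parsed exon rows gives a simple check that an entry was read completely.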



>2191157 

60 



717739 
2326



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2191157

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2191157 
len = 
Sngl 

>2191157 

len = 

Init 
Term 

>2191157 

len = 

Init 
Term 

>2191157 

len = 

Init 
Term 

>2191157 

len = 

Init 



1098 
1303 
1501 
1698 
1848 
2076 
2220 
2391 
2739 
2894 
3094 



904 
1201 
1418 
1603 
1798 
1936 
2164 
2317 
2467 
2835 
3002 



/21258 
2364 nex 



35554 
35854 
36017 
36362 
36622 
36794 
37265 
37474 
37753 



35767 
35917 
36231 
36538 
36696 
35895 
37376 
37620 
37793 



/42174 

540 nex = 

59287 59826 

727625 

732 nex = 

80900 81166 
81274 81631 

/41361 

2136 nex = 

83526 83731 
83861 84187 

732265 

2008 nex = 

83526 83731 
83861 84181 

72495 

2795 nex = 

92543 92875 
Intr
Intr 
Intr 
Term 

>2191181 

len = 

Init 
Intr 
Intr 
Term 

>2191181 

len = 

Term 
Intr 
Init 

>2191181 
len = 

>2213606 

len = 

Init 
Intr 
Intr 
Term 

>2213606 

len = 

Sngl 

>2213606 

len = 

Sngl 

>2213606 

len = 

Sngl 

>2213606 

len = 

Sngl 



93634 93776 

94054 94077 

94512 94714 

94965 95337 

/38304 

2070 nex = 

1742 2050 

2468 2686 

2758 2844 

3193 3219 

/23239 

988 nex =

4337 3802 
4497 4418 
4789 4601 

/30935 
1455 nex = 

/6503 

1974 nex = 

15815 16171 

16373 16842 

16925 17188 

17281 17788 

/10990 

413 nex = 

18252 17840 

/38093 

490 nex = 

27032 27514 

/23231 

700 nex =

45292 44593 

/31944 

559 nex = 

49930 49372 



60 >2244747 



/16846 
Init
Term 

>2244747 

len = 

Sngl 

>2244747 

len = 

Sngl 

>2244747 

len = 

Sngl 

>2244747 

len = 

Sngl 

>2244747 

len = 

Sngl 

>2244747 

len = 

Sngl 

>2244747 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2244747 

len = 

Init 



12786 13565 
13854 14802 

738987 

134 nex = 

14762 14895 

/17977 

610 nex = 

16599 15997 

/19172 

610 nex = 

16614 16009 

/30129 

813 nex = 

176792 177114 

/195 

805 nex = 

176309 177113 

/101734 

340 nex = 

198899 199238 

/126389 

1776 nex = 



48741 
48995 
49141 
49296 
49486 
49614 
49983 
50189 



48903 
49057 
49207 
49396 
49530 
49895 
50085 
50516 



/25991 
1850 nex = 
48741 48903 






Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2244747 

len = 



48995 49057 

49141 49207 

49296 49396 

49486 49530 

49614 49895 

49983 50085 

50189 50590 

/99093 

430 nex = 



Init 
Intr 
Term 



48743 48903 
48995 49057 
49141 49172 



>2244747 

len = 

Sngl 

>2244747 

len = 

Sngl 

>2244747 

len = 

Term 
Init 



/7346 

550 nex = 

51305 50761 

/13520 

522 nex = 

53660 53139 

/18697 

817 nex = 

56326 55871 
56687 56413 



>2244747 
len = 



/35186 
1525 nex =



Term 
Intr 
Intr 
Intr 
Init 



56326 55870 

56685 56413 

55884 56777 

57220 56989 

57394 57303 



>2244747 
len = 



739975 
2277 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



56326 55859 

56685 56413 

56884 56777 

57220 56989 

57530 57303 

57816 57621 

58135 57936 

/108308



60 len = 



2306 nex = 



6 
Term
Intr 
Intr 
Intr 
Intr 
Init 

>2244747 

len = 

Init 
Intr 
Term 

>2244747 

len = 

Term 
Intr 
Intr 
Init 

>2244747 

len = 

Term 
Intr 
Init 

>2244747 

len = 

Term 
Intr 
Init 

>2244788 

len = 

Init 
Term 

>2244788 

len = 

Term 
Intr 
Init 



58896 58494 

59256 58984 

59446 59412 

59994 59535 

60270 60075 

60799 60608 

734967 

1692 nex = 

78644 78978 
79811 79967 
80055 80335 

729662 

2324 nex = 

6181 5707 

6376 6275 

6858 6468 

8030 7268 

710852 

948 nex = 

95484 95087 
95756 95563 
96034 95845 

733554 

1225 nex = 

95484 94981 
95756 95563 
96205 95845 

733860 

894 nex = 

119066 119340 
119433 119959 

74232 

1570 nex = 

11837 11610 
12997 12874 
13171 13086 



len 




1736 nex = 






Init 134496 134633 

Intr 134785 134908 

Intr 135250 135306 

Term 135918 136231 



>2244788 
len = 



/4905 



1532 



nex =



Init 
Intr 
Intr 
Term 

>2244788 

len = 



134547 134633 

134785 134908 

135250 135306 

135918 136078 

/18255 



1917 



nex 



Term 
Intr 
Intr 
Init 

>2244788 

len = 



11837 11553 

12997 12874 

13171 13086 

13469 13401 

742223 

1270 nex = 



Init 
Term 



141770 141970 
142713 143034 



>2244788 
len = 



Term 
Init 



>2244788 
len = 



172609 172540 
173404 172806 



932 



Init 
Intr 
Intr 
Term 



176283 176507 

176602 176703 

176785 176939 

176951 177214 



>2244788 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>2244788 



/31495 

1150 nex = 

177820 177887 

178110 178208 

178295 178347 

178445 178518 

178797 178969 

/40073 



len = 

60 



1761 nex =



5 






Term 182960 182681 

Intr 183144 183074 

Intr 183352 183228 

Intr 183544 183430 

Init 184441 183731 

>2244788 /2738



1855 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



182960 
183144 
183352 
183544 
183825 
184012 
184555 



182701 
183074 
183228 
183430 
183731 
183901 
184343 



1337 nex = 



Term 
Intr 
Intr 
Init 



744 549 

903 829 

1232 1053 

1885 1804 



len =



/16319 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



188526 
188710 
188914 
189112 
189340 
189532 
189945 



188214 
188640 
188790 
188998 
189246 
189421 
189850 



>2244788 
len = 



790 



nex =



Term 
Intr 
Intr 
Init 



26188 26035 

26496 26276 

26702 26590 

26822 26779 



>2244788 
len = 



/37809 



2215 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



29960 
30139 
30309 
30490 
30687 
30881 
31057 
31236 



29503 
30054 
30235 
30388 
30606 
30790 
30969 
31156 
Intr
Init 

>2244788 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2244788 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>2244788 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2244788 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2244788 

len = 

Sngl 

>2244788 

len = 

Sngl 

>2244788 



31450 31335 
31717 31579 



1700 nex = 

45280 45046 

45431 45380 

45545 45518 

46149 46080 

46413 46313 

46745 46519 

/40736 

1713 nex = 

57948 58133 

58560 58765 

58850 58930 

59012 59174 

59262 59660 

/1718 

1844 nex = 

60276 59985 

60467 60369 

60644 60555 

60856 60742 

61828 61672 

/94503 

1930 nex =

60276 59949 

60467 60369 

60644 60555 

60856 60742 

61875 61672 

728978 

921 nex = 

63706 62786 

736844 

1309 nex = 

78815 80123 

742933 



len = 

60 



2960 nex = 



6 



25 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



92232 
92959 
93567 
93831 
94438 
94602 



92765 
93121 
93743 
93914 
94519 
95191 



>2244829 /38042 
10 len = 2717 nex : 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



103735 
104329 
104545 
104833 
105212 
105486 
105738 
106013 
106159 



104049 
104423 
104609 
104876 
105295 
105639 
105920 
106069 
106451 



>2244829 7293 

len = 315 nex = 

Sngl 114012 113698 

>2244829 /40074 

30 len = 1498 nex = 



Term 115095 113973 
Init 115470 115294 



35 >2244829 



Term 115095 113698 
Init 115493 115294 



2190 nex =



Init 
Intr 
Intr 
Intr 

50 Intr 
Intr 
Intr 
Term 

55 >2244829 
len = 



116378 
116787 
116953 
117143 
117526 
117791 
117992 
118269 



116531 
116872 
117024 
117180 
117569 
117837 
118166 
118567 



729288 
492 nex 



Sngl 131227 130736 

60 



0 






>2244829 /24175 

len = 332 nex = 

5 Sngl 136899 137230 

>2244829 /17179 

len = 450 nex = 

10 

Sngl 136899 137332 

>2244829 799523 

15 len = 346 nex = 

Sngl 136900 137245 

>2244829 /37184 

20 

len = 624 nex = 

>2244829 /126602 

25 len = 654 nex = 

Sngl 136900 137553 

>2244829 /15384 

30 

len = 627 nex = 

Sngl 136904 137530 

35 >2244829 726797 

len = 628 nex = 

Sngl 136904 137531 

40 

>2244829 736129 

len = 739 nex = 

45 Sngl 199828 200566 

>2244829 724266 

len = 1908 nex = 

50 

Init 65354 65621 

Intr 65713 65836 

Term 66807 67261 

55 >2244829 731856 

len = 897 nex = 

Init 70117 70500 

60 Intr 70585 70611 
Term
>2244829 
5 len = 

Sngl 
>2244829 

10 

len = 
Sngl 

15 >2244829 
len = 
Term 

20 Intr 
Intr 
Intr 
Intr 
Intr 

2 5 Intr 
Intr 
Init 

>2244829 

30 

len = 
Sngl 

35 >2244829 
len = 
Term 

40 Intr 
Init 

>2244829 

45 len = 

Term 
Intr 
Intr 

5 0 Intr 
Intr 
Intr 
Intr 
Init 

55 

>2244870 
len = 



70696 71013 

/30327 

711 nex = 

82258 82968 

/33166 

650 nex = 

82303 82952 

742848 

2473 nex = 

83367 83062 

83556 83476 

83703 83644 

83890 83811 

84071 84020 

84306 84169 

84661 84398 

84799 84742 

84996 84887 

/22861 

611 nex = 

85902 86512 

/25333 

2115 nex = 

87340 86629 
87618 87443 
88743 87767 

/117350 

1760 nex = 

93545 93422 

93819 93710 

93998 93936 

94168 94094 

94368 94276 

94573 94469 

94861 94740 

95181 94950 

72163 

1517 nex = 



+ 0 



+ 0 



0 



9 

0 
0 
0 
0 
0 
0 
0 
0 
0 



1 

+ 0 



3 

0 
0 
0 



8 

0 
0 
0 
0 
0 
0 
0 
0 



6 0 Sngl 



13507 15023 



0 
>2244870
len = 

5 

Init 
Intr 
Intr 
Intr 

10 Intr 
Intr 
Term 

>2244870 

15 

len = 

Term 
Init 

20 

>2244870 
len = 

2 5 Term 

Init 

>2244870 

3 0 len = 

Sngl 
>2244870 

35 

len = 
Sngl 

40 >2244870 
len = 
Sngl 

45 

>2244901 
len = 
50 Sngl 
>2244901 
len = 

55 

Init 
Term 

>2244901 

60 



/15641 

1853 nex = 

2352 2569 

2668 2781 

2862 2957 

3057 3099 

3174 3326 

3408 3476 

3843 4204 

/35290 

1090 nex = 

33366 33045 
34113 33943 

/18642 

867 nex =

4431 4071 
4937 4513 

/30852 

513 nex = 

70945 70433 

/36205 

1210 nex = 

71644 70435 

/30929 

867 nex = 

84563 85414 

/32219 

644 nex = 

100297 100940 

/101301 

1235 nex = 

12251 12597 
13371 13485 

/15334 
2089



Init 
Intr 
Intr 
Term 



12251 12597 

13371 13484 

13678 13835 

13944 14339 



>2244901 
len = 



/14485 
1048 nex =



Term 
Init 



136645 136202 
137249 136976 



>2244901 



/8916 
761 ne 



Init 
Term 



146636 146871 
146912 147396 



>2244901 
len = 



/22637 
1930 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2244901 

Sngl 
>2244901 
len = 



150934 151112 

151807 151845 

151938 151991 

152091 152144 

152269 152322 

152417 152488 

152622 152862 

75455 

550 nex = 

153514 154059 

/25390 

1731 nex = 



Term 
Intr 
Init 



156239 156216 
156385 156332 
157099 156997 



>2244901 
len = 



/39757 
1489 nex 



Term 
Intr 
Intr 
Intr 
Init 



164193 163773 

164487 164293 

164750 164603 

164938 164832 

165261 165017 



60 len = 



250 nex = 
Sngl

>2244901 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2244901 

len = 

Init 
Term 

>2244901 
len = 
Sngl 

>2244901 
len = 



165261 165021 

/43007 

3418 nex = 

181307 182180 

182482 182558 

182639 182732 

182817 182915 

183212 183301 

183400 183519 

183767 183870 

184163 184235 

184397 184724 

/8381 

928 nex = 

197128 197392 
197699 198055 

735383 

1690 nex = 

23032 21343 

/12451 

2050 nex = 



Init 
Intr 
Intr 
Term 

>2244901 

len = 

Term 
Intr 
Intr 
Init 

>2244901 

len = 



29261 29459 

29681 29785 

29969 30397 

30959 31303 



78234 



855 

33518 
33802 
34017 
34150 



nex = 

33296 
33633 
33880 
34103 



733073 
3028 nex 



Init 
Term 

>2244901 

len = 

Init 



4164 4631 
6071 7191 

7307 

1838 nex = 

44565 
Intr
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



44976 45044 

45145 45198 

45288 45327 

45414 45512 

45595 45819 

45902 46023 

46120 46402 



/19122 



len = 

Init 
Intr 

Intr
Intr 
Intr 
Intr 
Intr 

Term



1766 nex =



44638 
44976 
45145 
45288 
45414 
45595 
45902 
46120 



44888 
45044 
45198 
45327 
45512 
45819 
46023 
46403 



>2244901 
len = 



737345 
1379 nex 



Init 
Intr 
Term 

>2244901
len = 



55027 55308 
55387 55671 
55759 56179 

/26019 

1750 nex = 



Init 

Intr

Term 

>2244901 

len =

Init 
Term 



77747 78039 

78780 78906 

79065 79492 

/933 

1415 nex = 

86075 86413 

86998 87489 



>2244950
len = 



3346 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



100982 
101466 
101718 
102002 
102439 
102690 
102958 
103205 
103432 
103970 



100625 
101106 
101591 
101874 
102360 
102527 
102773 
103074 
103291 
103568 



60 



>2244950 /40414 
len =



Term 
Intr 
Intr 
Intr 
Intr 
Init 



109338 
109551 
109708 
109850 
110001 
111043 



109067 
109489 
109646 
109803 
109939 
110961 



2050 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2244950 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2244950 

len = 

Init 
Intr 
Intr 
Term 

>2244950 

Init 
Term 

>2244950 
len = 
Sngl 

>2244950 
len = 



109338 
109551 
109708 
109850 
110001 
111043 



109187 
109489 
109646 
109803 
109939 
110961 



/5714 

1403 nex = 

124186 124326 

124418 124469 

124596 124670 

124766 124794 

124968 125001 

125082 125152 

125251 125588 

/33513 

1593 nex = 

138127 138644 

138739 138858 

138934 139180 

139256 139719 

/19028 

638 nex = 

139024 139180 
139256 139661 

/21894 

1030 nex = 

146832 145803 

/7605 

814 nex = 



Term 167332 166714



0 
Init 167527 167451

>2244950 /3176 

len = 1423 nex = 

Term 167332 166764 

Init 167934 167451 

>2244950 /41791 

len = 1479 nex = 

Term 167332 166712 

Intr 167934 167451 

Init 168190 168116 

>2244950 /12256 

len = 1716 nex = 

Term 169269 169015 

Intr 169606 169448 

Intr 170335 170260 

Init 170730 170607 

>2244950 /6723

len = 1536 nex = 

Init 171676 171958 

Intr 172224 172415 

Intr 172496 172661 

Term 172740 173211 

>2244950 /124835 

len = 978 nex = 

Sngl 18831 19808 

>2244950 /40793 

len = 1247 nex = 

Term 193189 192906 

Intr 193587 193266 

Init 194152 193673 

>2244950 /2803 

len = 1824 nex = 

Init 2896 3184 

Intr 3571 3676 

Term 4403 4719 

>2244950 /9209 

len = 573 nex = 
Sngl
>2244950 

Sngl 

>2244950 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2244950 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



31137 30565 

729655 

682 nex =

34486 35167 

/40913 

2079 nex =

4949 5128 

5254 5419 

5498 5550 

5911 5973 

6366 6416 

6516 6630 

6687 7027 

/18234 

1950 nex = 



61059 
61420 
61714 
61882 
62016 
62293 



61335 
61550 
61791 
61926 
62060 
62389 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>2244950 
len = 
Sngl 

>2244950 

len = 

Term 
Init 



1510 nex = 

7376 7454 

7542 7577 

7707 7844 

7939 8012 

8418 8486 

8556 8884 

/31782 

1211 nex = 

84183 82973 

/17019 

2897 nex = 

84672 82981 

85877 85235 



len = 

60 



397 nex = 



1 
Sngl

>2244991 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2244991 
len = 
Sngl 

>2244991 
len = 



95604 96000 
/7101 
1300 nex = 



99473 
99674 
99851 
100015 



99160 
99597 
99788 
99939 



100216 100170 
/14136 
1251 nex = 
133001 131751 
/24611 
1275 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



144816 144916 
144996 145065 



145153 
145299 
145408 
145593 



145209 
145360 
145507 
145964 



>2244991 
len = 



/5546 
1163 nex =



Term 
Intr 
Init 

>2244991 
len = 

>2244991 
len = 



Init 
Intr 
Term 

>2244991 

len = 



157187 156808 
157430 157305 
157970 157545 

/8212 

1254 nex = 

/40778 

879 nex =

163368 163492 
163658 163757 
163863 164240 

/23771 

1377 nex = 



Term 
Intr 
Intr 
Init 



164902 164507 

165186 164989 

165666 165500 

165883 165813 



>2244991 

60 



/16525 
Init
Term 



>2244991 
len = 



172277 172503 
172604 173086 



/22084 
1450 nex 



Init 
Term 

>2244991 
len = 
Sngl 

>2244991 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2244991 
len = 
Sngl 

>2244991 

len = 

Term 
Intr 
Init 

>2244991 

Term 
Intr 
Intr 
Intr 
Intr 
Init 



177203 177333 
177407 177827 

/157870 

342 nex = 

17882 17541 



2453 

194540 
194759 
194888 
195027 
195163 
195344 
195623 
195980 
196138 
196848 



nex = 

194396 
194680 
194843 
194971 
195105 
195244 
195502 
195929 
196058 
196213 



/2505 

623 nex = 

27093 26471 

/7632 

1210 nex = 

36794 36385 
37205 37073 
37590 37308 

/30471 

1883 nex = 

39363 38946 

39486 39437 

39651 39570 

39806 39736 

40168 40098 

40371 40292 



/17535 
len =

Sngl 

>2244991 

len = 

Init 
Term 

>2244991 

len = 

Init 

>2244991 

len = 

Sngl 

>2244991 

len = 

Sngl 

>2244991 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2244991 

len = 

Sngl 

>2245031 

len = 

Sngl 

>2245031 

len = 

Sngl 

>2245031 



585 nex =
43288 43872 
/17553 
628 nex =



44575 
44876 



44786 
45202 



/16090 

634 nex = 

44583 44786 
44876 45216 

/31946 

562 nex = 

66524 65963 

/6580 

509 nex = 

70265 69757 

/17851 

1752 nex = 



71484 
71754 
71898 
72484 
72626 



71210 
71636 
71846 
72429 
72579 



/92054 
587 nex = 
8564 9150 
/92144 
444 nex = 
125198 125641 
/30087 
822 nex = 
125198 126019 
/118011 
Sngl 125287 125641
>2245031 /91870



len =



1970 



Init 144106 144256 

Intr 144641 144768 

Intr 145143 145253 

Term 145583 146075 

>2245031 /36017

len = 3647 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



154141 
155021 
155252 
155661 
155955 
156204 
156561 
157572 



153926 
154948 
155139 
155584 
155829 
156149 
156358 
157241 



3010 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



157780 
157993 
158517 
158708 
159068 
159412 
159590 
159798 
159938 
160067 
160354 
160554 



157908 
158125 
158604 
158784 
159107 
159497 
159671 
159854 
159976 
160137 
160407 
160780 



>2245031 



3018 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



157780 
157993 
158517 
158708 
159068 
159590 
159798 
159938 
160067 
160354 
160554 



157908 
158125 
158604 
158784 
159497 
159671 
159854 
159976 
160137 
160407 
160797 



>2245031



/110681 
Init
Term 

>2245031 
len = 
Sngl 

>2245031 

len = 

Init 
Intr 
Intr 
Term 

>2245031 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2245031 
len = 
Sngl 

>2245031 

len = 

Init 
Intr 
Term 

>2245031

len = 

Sngl 

>2245031 

len = 

Sngl 

>2245073 

len = 



172709 172801 
172906 173174 

/142850 

610 nex = 

173847 173242 

/42533 

1533 nex = 

17415 17650 

17764 18062 

18331 18410 

18499 18947 

736882 

2299 nex =

173963 173241 

174262 174007 

174516 174406 

174824 174614 

175539 174923 

/14613 

673 nex =

20501 19829 

/831 

850 nex =

39954 40111 
40198 40248 
40330 40796 

/14223 

638 nex =

43095 43370 

735772 

1663 nex = 

48986 49948 

7158661 

739 nex = 
Sngl


102245 


101507 


- 


0 




>2245073 


/34167 






5 














len =


1019 


nex = 


3 






Init 


104868 


105196 


+ 


0 




Intr 


105282 


105361 


+ 


0 


10 




105463 


105866 


+ 


0 




>2245073 


/36603 








len = 


4481 




11 




15 














Term 


6893 


6584 




0 




Intr 


7287 


7083 




0 




Intr 


7700 


7618 


- 


0 




Intr 


8129 


7990 




0 


20 


Intr 


8424 


8266 


-


0 




Intr 


9480 


8479 




0 




Intr 


9839 


9542 


-


0 




Intr 


10132 


9928 




0 






10433 


10351 






25 


Intr 


10748 


10609 




0 




Init 


11064 


10945 




0 




>2245073


737223 






3 0 


len 


4483 


nex = 








Term 


6893 


6584 








Intr 


7287 


7083 




0 




Intr 


7700 


7618 


- 


0 


35 


Intr 


8129 


7990 


- 


0 






8424 


8266 








Intr 


9480 


8479 




0 




Intr 


9839 


9542 


- 


0 




Intr 


10132 


9928 




0 


40 


Intr 


10433 


10351 


-


0 




Intr 


10748 


10609 




0 




Init 


11066 


10945 


-


0 




>2245073 


/6042 






45 














len =


959 










1 

Sngl


124096 


125054 




0 


50 


>2245073


/35156 








len 


2133 




7 






Init 


136139 


136418 


+ 


0 


55 


Intr 


136654 


136948 


+ 


0 




Intr 


137036 


137101 


+ 


0 




Intr 


137200 


137329 


+ 


0 




Intr 


137421 


137579 


+ 


0 




Intr 


137703 


137753 


+ 


0 


60 


Term 


137855 


138271 


+ 


0 
>2245073



/154342 



len = 111 nex = 

5 

Sngl 140364 140254 

>2245073 /3258 

len = 2050 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



138586 
138787 
139039 
139188 
139338 
139469 
139680 
140370 



138326 
138684 
138884 
139117 
139291 
139422 
139608 
140183 



1690 



nex 



Init 145051 145144 

Intr 145227 145544 

Intr 145712 145798 

Intr 145888 146021 

Term 146416 146733 

>2245073 /17120



Init 145081 145144 
Term 145227 145544 



Init 168520 168924 

Intr 169023 169160 

Term 169230 169591 

>2245073 /23025 

len = 2715 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term



181224 
181935 
182407 
182789 
183152 
183325 
183502 
183704 



181382 
181992 
182489 
183061 
183204 
183405 
183614 
183938 



>2245073 

60 



/19505 
len = 2035 nex =

Init 189969 190426 

Intr 190764 190988 

Intr 191116 191225 

Term 191315 191480 

>2245073 /31781 

len = 1939 nex = 

Init 190050 190426 

Intr 190764 190988 

Intr 191116 191225 

Term 191315 191480 

>2245073 /36521 

len = 730 nex = 

Sngl 190098 190332 

>2245073 /39872

len = 2135 nex =

Init 192291 192840 

Intr 193297 193492 

Intr 193589 193720 

Term 194093 194425 

>2245073 /6709 

len = 1058 nex = 

Term 198909 198442 

Init 199499 199146 

>2245073 /94923

len = 739 nex = 

Init 20607 20828 

Term 20918 21345 

>2245073 /24997

len = 530 nex =

Sngl 26357 25828 

>2245073 /33509

len = 1450 nex = 

Init 38766 39446 

Term 39638 40214 
Term
Intr 
Init 



43961 43535 
44176 44048 
45091 44398 



>2245073 
len =
Sngl 

>2245073 
len = 



/27500 
1700 nex = 
51675 51471 
/99796 



Init 
Term 



64024 64466 
64647 65171 



>2245073 
len = 



1278 nex =



Term 
Intr 
Intr 
Init 



79423 79066 

79725 79528 

80213 80047 

80343 80313 



726448 



2146 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



87600 
87818 
88211 
88333 
88636 
88765 
88913 
89406 



87509 
87699 
88116 
88295 
88458 
88726 
88854 
89167 



>2245126 
len = 



nex 



Term 
Intr 
Intr 
Intr 
Init 



28671 
28825 



29183 
29950 



27817 
28745 
28913 
29080 
29830 



len = 

Init 
Intr 
Intr 
Intr 
Intr 



1873 nex =



30483 
30977 
31153 
31365 
31521 



30887 
31070 
31292 
31439 
31678 
Intr 31762 31823
Term 31972 32355 



>2245126 

len = 

Init 
Intr 
Intr 
Term 



/42815 

1514 nex = 

56618 56988 

57254 57524 

57621 57791 

57887 58131 



>2252639 

len =

Term 
Intr 
Intr 

Intr
Intr 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Init 

>2252639



/36439 



2305 nex 



112752 
112953 
113158 
113355 
113539 
113704 
113928 
114069 
114227 
114489 
114748 
114983 



112679 
112837 
113042 
113254 
113444 
113623 
113814 
114018 
114147 
114328 
114572 
114885 



817 6 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



112752 
112953 
113158 
113355 
113539 
113704 
113928 



112549 
112837 
113042 
113254 
113444 
113623 
113814 



>2252639 
len = 



Init 


55275 


55373 


+ 


Intr 


55679 


55864 


+ 


Intr 


55943 


56072 


+ 


Intr 


56168 


56248 


+ 


Intr 


56342 


56529 


+ 


Intr 


56624 


56719 


+ 


Intr 


56822 


56915 


+ 


Intr 


57043 


57162 


+ 


Term 


57257 


57336 


+ 



>2252639 
len = 
Init 



742847 
2459 nex = 
64066 64204 
Intr
Term 

>2252639 

len = 

Sngl 

>2252639 

len = 

Sngl 

>2252639 

len = 

Sngl 

>2252639 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2252639 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2252639 
len = 
Sngl 

>2252639 

len = 

Term 
Intr 
Init 



65296 65804 
65895 66271 

/20756 

561 nex = 

66935 66375 

/8355 

619 nex = 

67016 66406 

/104398 

114 nex = 

67655 67768 

734829 

1550 nex = 



72152 
72324 
72574 
72867 
73235 



71686 
72213 
72402 
72664 
73005 



734276 
2157 nex = 



76139 
76346 
76530 
76771 
76952 
77979 



75823 
76218 
76444 
76626 
76898 
77037 



711108 

539 nex = 

79342 78804 

71269 

1433 nex = 

79851 79679 
80212 80012 
80700 80396 



len = 

60 



835 nex =



3 
Init
Intr 
Term 

>2252639 

len = 

Init 
Intr 
Term 

>2252639 

len = 

Init 
Intr 
Term 

>2252639 

len = 

Init 
Intr 
Term 

>2252639 

len = 

Init 
Intr 
Term 

>2252639 

len = 

Term 
Init 

>2252823 

len = 

Sngl 

>2252823 

len = 

Sngl 

>2252823 

len = 

Sngl 



85064 85271 
85376 85455 
85554 85898 

735833 

873 nex =

85064 85271 
85376 85455 
85554 85936 

71810 

878 nex = 

85064 85271 
85376 85455 
85554 85941 

717857 

910 nex = 

85064 85271 
85376 85455 
85554 85972 

710862 

864 nex = 

85068 85271 
85376 85455 
85554 85931 

722773 

2008 nex = 

92196 90691 
92698 92411 

711106 

1289 nex = 

107171 108459 

725765 

315 nex = 

1671 1357 

738970 

2486 nex = 

29968 30145 
>2252823

len = 

Init 
Intr 
Term 

>2252823 

len = 

Init 
Intr 
Term 

>2252823 

len = 

Sngl 

>2252823 

len = 

Sngl 

>2252823 

Term 
Intr 
Init 

>2252823 

len = 

Term 
Intr 
Init 

>2252823 

len = 

Init 
Term 

>2252848 

len = 

Sngl 

>2252848 



/15741 

3070 nex = 

29968 30145 
30436 30547 
30642 31104 

728637 

2900 nex = 

35493 36349 
36852 37326 
37673 38392 

721038 

495 nex =

37895 38389 

735506 

582 nex = 

50035 49454 

739479 

1604 nex = 

56402 56064 
57185 56486 
57649 57493 

736326 

2392 nex =

64455 64054 
64734 64625 
65205 64824 

731027 

1150 nex = 

94085 94153 
94219 95230 

7111719 

733 nex = 

46064 45332 

711036 
len =

Sngl 

>2252848 

len = 

Sngl 

>2252848 

len = 

Sngl 

>2252848 

len = 

Sngl 

>2252848 

len = 

Init 
Intr 
Term 

>2252848

len = 

Init 
Intr 
Term 

>2252848 

len = 

Term 
Intr 
Intr 
Init 

>2252848 

len = 

Term 
Intr 
Init 

>2262097 

len = 

Init 



790 nex =

46089 45304 

/3204 

833 nex = 

60597 61429 

/22161 

670 nex = 

63070 63731 

/22348 

740 nex = 

65608 64869 

/28082 

1216 nex = 

80915 80991 
81337 81552 
81645 81897 

726442 

1210 nex = 

80915 80991 
81337 81552 
81645 81895 

/37305 

1575 nex = 

91905 91570 

92168 92002 

92528 92246 

92758 92613 

/37175 

2050 nex =

95449 94674 
95668 95551 
96720 96101 

/22611 

1439 nex =

31 168 
Intr
Intr 
Term 

>2262097 

len = 

Term 
Init 

>2262097 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2262097 

len = 

Term 
Intr 
Intr 
Init 

>2262135 

len = 

Term 
Init 

>2262135 

len = 

Term 
Init 



253 403 
481 885 
969 1469 

737663 

1694 nex = 

48814 47723 
49413 49234 

/37704 

1990 nex =

4521 4199 

4778 4665 

5379 5207 

5540 5489 

5680 5632 

6186 5782 

/112955 

2350 nex = 

89371 88825 

89563 89456 

89803 89654 

91172 90509 

/41490 

1454 nex = 

2318 1916 
3369 2625 

/20167 

1304 nex = 

4241 3765 
5068 4768 



>2262135 

len = 

Term 
Intr 
Init 

>2262135

len = 



/32291 

1390 nex = 

3887 3685 
4241 4100 
5072 4768 

76568 

2212 nex = 



Term 55501 55152 
Intr 55716 55591 
Intr 55868 55793 
Intr
Intr 
Init 

>2262135 

len = 

Init 
Intr 
Intr 
Term 

>2262135 
len = 
Sngl 

>2262135 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>2262135 
len = 
Sngl 

>2262135 

len = 

Init 
Intr 
Intr 
Term 

>2262135 

len = 

Sngl 

>2262135 

len = 

Sngl 

>2262135 

len = 



56088 55950 
56564 56483 
57080 56653 

/10207 

2063 nex = 

59951 60024 

60681 60762 

61016 61098 

61517 61813 

/18545 
647 nex = 
6145 6791 

/4346 
2939 nex = 



70603 
71555 
71842 
71994 
72734 
72893 



71150 
71677 
71907 
72059 
72814 
73541 



/26127 
817 nex = 
10051 10199 
/8114 
1879 nex = 

97068 97416 
98158 98297 
98468 98540 
98650 98946 

/34186 

347 nex =

97069 97415 
/145375

319 nex = 
10051 10164 
/18454 
354 nex = 
Sngl
>2262135 
len = 



10051 10199 



Init 
Intr 
Term 



99470 99712 
99822 99870 
99982 100642 



>2262155 



15 



len = 
Sngl 
>2262155 
len =



23119 22463 
738365 
2443 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



33741 
33874 
34038 
34207 
34357 
34542 
35004 
35174 
35320 
35536 
36051 



33609 
33812 
33961 
34130 
34283 
34456 
34864 
35106 
35254 
35471 
35849 



>2262155 
len = 



2710 nex 



Term 
Intr 

Intr
Intr 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Intr 
Init 

50 

>2262155 



41819 
42007 
42177 
42353 
42507 
42691 
42920 
43144 
43300 
43448 
43690 
44238 



41536 
41945 
42100 
42276 
42433 
42605 
42792 
43004 
43232 
43382 
43625 
44044 



len =



1776 



nex =



Init 
Intr 
Intr 
Term 



47118 47195 

47279 47459 

47575 47672 

47837 48384 



>2262155



/13246 
len =


1990 nex = 


6 








Init 


54079 54165 








5 


Intr 


54255 54346 


+ 


0 






Intr 


54432 54540 










Intr 


54640 54675 


+ 


0 






Intr 


54764 54850 


+ 


0 






Term 


54940 55113 


+ 


0 




10 














>2262155 


734698 










len = 


1459 nex = 


6 






15 


Init 


56211 56260 




0 






Intr 


56344 56556 


+ 


0 






Intr 


56654 56802 


+ 


0 






Intr 


56878 57034 


+ 


0 






Intr 


57160 57252 


+ 


0 




20 


Term 


57530 57669 




0 






>2262155 


/39211 










len = 


2110 nex = 


2 






25 














Init 


64477 65546 


+ 


0 






Term 


66273 66579 


+ 


0 






>2262155 


/19601 








30 














len = 


2050 nex = 


2 








Init 


64534 65546 


+ 


0 






Term 


66273 66579 


+ 


0 




35 














>2262155 


/32751 










len = 


850 nex = 


1 






40 


Sngl 


77445 76604 


- 


0 






>2262155


73276 










len = 


1167 nex = 


1 






45 














Sngl 


8628 9794 


+ 


0 






>2264302 


738370 








50 


len = 


1450 nex = 


2 








Term 


35101 34004 




0 






Init 


35452 35188 




0 




55 


>2264302 


79562 










len = 


2074 nex = 


0 





>2264302 

60 



728046 
1581 nex =



Term 
Intr 
Intr 
Init 



51719 51257 

52040 51910 

52474 52402 

52837 52724 



/16428 



1571 nex =



Term 
Intr 
Intr 
Init 



51818 51294 

52040 51910 

52474 52402 

52864 52724 



/100085 



Term 
Intr 
Init 



5287 4881 
5613 5357 
6134 5782 



>2264303 
len = 



1735 



nex =



Init 
Intr 
Intr 
Intr 
Intr 
Term 



14289 14642 

14799 14910 

15002 15095 

15228 15405 

15488 15557 

15638 16023 



>2264303 



/7145 



824 



Init 
Intr 
Intr 
Term 



3387 3465 

3544 3666 

3754 3870 

3947 4205 



>2264303 
len = 



74273 
1845 nex =



Term 
Intr 
Init 



45044 44650 
45266 45126 
46494 46178 



>2264303 
len = 



1469 nex =



Init 
Intr 
Intr 
Term 



58748 59002 

59229 59277 

59634 59833 

59930 60216 
>2264303
len = 

5 

Term 
Intr 
Intr 
Init 

10 

>2264304 

len = 

Init
Intr 
Intr 
Intr 
Term 

20 

>2264304 

len = 

Term
Intr 
Intr 
Intr 
Init 

30 

>2264304 

len = 

Init
Intr 
Intr 
Intr 
Term 

40 

>2264304 

len = 

Init 
Term 



1825 nex = 

64023 63682 

64570 64473 

65089 64989 

65506 65289 



45 



>2264304
len =

Sngl 
>2264304 
len = 



/34402 



2558 



nex 



55 



20281 20902 

21285 21510 

21627 21849 

22104 22317 

22554 22838 

734783 

2075 nex =

23983 23714 

24174 24080 

24709 24267 

25149 24793 

25788 25400 

/39319 

1870 nex =

2871 2989 

3690 3771 

3960 4165 

4328 4381 

4476 4733 

/9159 

1570 nex = 

41803 42064 
42974 43372 

738464 

1270 nex = 

51034 52303 

728578 

2110 nex = 



Init 515 1139 

Intr 1407 1504 
Intr 1754 1853
Intr
Term 

>2264304 
len = 
Sngl 

>2264304 

len = 

Init 
Term 

>2264304 

len = 

Sngl 

>2264304 

len = 

Sngl 

>2264304

len = 

Sngl 

>2264304 

len = 

Sngl 

>2264304 

len = 

Sngl 

>2264304 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



2027 2272 
2358 2618 

/41195 
353 nex = 
57898 57549 

/2871 

430 nex =

6595 6647 
6733 7019 

/30073 

1810 nex = 

65320 65030 

/32071 

1128 nex = 

67814 67283 

/103464 

1096 nex =

67814 67316 

/17818 

1136 nex = 

67814 67277 

/24095 

596 nex =

72223 72818 

/111741 

2898 nex =

77610 77692 

78044 78153 

78600 78734 

78876 79022 

79400 79483 

79589 79635 

79729 79802 

79915 79973 

80152 80212 
>2264305

Init 
Intr 
Intr 
Intr 
Term 

>2264305 

len = 

Term 
Intr 
Init 

>2264305 

len = 

Term 
Intr 
Intr 
Init 

>2264305 

len = 

Term 
Init 

>2264305 

len = 

Term 
Intr 
Intr 
Init 

>2264305 
len = 
Sngl 

>2264305 

len = 

Init 
Intr 
Intr 
Term 



1493 



nex 



31119 31386 

31604 31784 

31864 32005 

32090 32159 

32249 32611 

/98400 

993 nex = 

4415 4173 
4868 4742 
5152 4965 

736333 

1450 nex = 

4415 4119 
4868 4742 
5244 4965 
5422 5374 

/121728 

550 nex = 

5244 5080 
5422 5374 



1312 



nex 



4415 4326 

4868 4742 

5244 4965 

5422 5374 

724983 

599 nex = 

64677 64079 

716865 

1615 nex = 

71009 71096 

71447 71574 

71737 71841 

72035 72347 



>2264305 

60 



735698 
1150



Init 
Intr 
Intr 
Term 

>2264306 

len = 



71025 71096 

71447 71574 

71737 71841 

72035 72162 

/21505 

1450 nex = 



Term 
Intr 
Init 



10517 10132 
11048 10721 
11577 11259 



>2264306 
len = 



Term 
Init 



14439 14066 
14777 14527 



>2264306 

len = 

Term 
Intr 
Init 



1450 nex = 

14439 13966 

14854 14527 

15411 14979 



>2264306 
len = 
Sngl 

>2264306 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2596 2928 

739888 
2203 nex = 



35099 
35279 
35475 
35651 
35855 
36011 
36218 
36369 
36846 



34644 
35181 
35371 
35559 
35763 
35958 
36117 
36295 
36503 



1417 nex 



Init 
Intr 
Intr 
Intr 
Term 



41110 
41333 
41763 
42120 
42324 



41228 
41424 
41818 
42181 
42526 
>2264306
len = 



/3699 
1897 nex = 



Init 
Intr 
Intr 
Term 



5030 5266 

5420 6238 

6325 6526 

6551 6926 



>2264306 

len = 

Term 
Intr 
Init 

>2264306 

len = 



/6637 

1428 nex = 

80382 79690 
80764 80484 
81117 80852 

/111669 

382 nex = 



Init 
Term 



88535 88581 
88664 88916 



>2264307 



Term 
Init 



/42441 
682 nex 



48650 48344 
49017 48966 



>2264307 
len = 



722848 
658 nex =



48650 48368 
49017 48966 



>2264307 
len = 



Term 
Init 



/145394 
638 nex 



48650 48388 
49017 48966 



>2264307 
len = 



/11511 
776 nex =



Term 
Init 



>2264307 
len = 



Term 
Init 



48650 48252 
49027 48966 



/12330 
670 nex =



48650 48363 
49017 48966 



>2264307



737668 
2959



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



58676 
58819 
59006 
59148 
59415 
59547 
59753 
60223 
60499 
60688 
60911 
61393 



58435 
58762 
58939 
59089 
59374 
59504 
59684 
60104 
60481 
60616 
60847 
61056 



>2264307 
len = 



/24058 
1653 nex 



Init 
Intr 
Intr 
Term 

>2264308 
len = 
Sngl 

>2264308 
len = 



72492 72816 

73287 73411 

73485 73593 

73888 74144 

/1935 
1396 nex = 
17599 16204 

/22483 
2981 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2264309 
len = 
Sngl 

>2264309 
len = 



4792 
5296 
5495 
5737 
6028 
6224 
6544 
7396 



4416 
4866 
5375 
5588 
5823 
6110 
6307 
7131 



737959 
357 nex = 
16800 16444 
/15155 
872 nex = 



Init 
Term 



22581 
22927 



22830 
23337 



len =



4030 nex = 
Term
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2264309 
len = 
Sngl 

>2264309 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2264310 
len = 
Sngl 

>2264310 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2264310 
len = 
Sngl 

>2264310 
len = 



23729 
23957 
24155 
24319 
24499 
26484 
26721 
27488 



23461 
23827 
24049 
24241 
24413 
26236 
26572 
26913 



/109246 
614 nex = 
36598 37211 
734868 
2755 nex = 



56456 
57170 
57346 
57612 
57802 
58009 
58236 
58523 
58667 
58834 



56771 
57262 
57427 
57708 
57877 
58067 
58358 
58580 
58752 
59210 



/99461 
692 nex = 
11215 11906 
/15761 
2548 nex = 



19001 
19291 
19675 
19965 
20557 
21233 



18686 
19099 
19440 
19793 
20507 
20635 



/11083 
565 nex = 
2390 2039 
/31527 



589 



nex 



Sngl 45291 45879 

60 



0 
>2264310
len = 
Sngl 

>2264310 
len = 



1961 



Init 
Intr 
Intr 
Intr 
Term 



8184 8440 

8574 8786 

8879 9037 

9616 9684 

9797 10144 



>2264311 



732868 



len = 

Term 
Intr 
Intr 
Intr 
Init 



1724 nex = 

22845 22268 

23036 22924 

23230 23115 

23684 23307 

23977 23868 

76256 



Term 
Intr 
Init 

>2264311
len = 
Term 

Intr
Intr 
Intr 
Intr 
Intr 

Intr
Init 



61688 61655 
61915 61777 
62223 62000 

/125951 

2213 nex = 



60708 
60920 
61074 
61491 
61688 
61915 
62223 
62668 



60456 
60814 
61009 
61410 
61644 
61777 
62000 
62430 



>2264311 
len = 



1880 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



82920 
83150 
83482 
83616 
83788 
83928 
84280 



82401 
83009 
83243 
83581 
83708 
83871 
84011 



>2264312



/14950 
len =

Sngl 

>2264312 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2264312 

len = 

Sngl 

>2264312 

len = 

Sngl 

>2264312 

len = 

Sngl 

>2264312 

len = 

>2264312 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>2264312 

len = 

Init 
Intr 
Term 

>2264313 

len = 

Init 



881 nex = 
27808 26928 
795433 
1318 nex = 



41828 
42031 
42285 
42519 
42741 



41561 
41958 
42119 
42450 
42601 



/41937 
412 nex = 
46315 45915 
/13715 
505 nex = 
46419 45915 
/20908 
1588 nex = 
47047 45915 
/121153 
1599 nex = 

/21872 
1999 nex =



76178 
76875 
77349 
77680 
77884 



76439 
77278 
77609 
77802 
78176 



/40252 

929 nex =

8129 8281 
8374 8529 
8834 9057 

/13012 

2530 nex = 

50735 51416 
Intr
Term 

>2264313 

len = 

Term 
Intr 
Intr 

Init 

>2264314 

len = 

Term 
Intr 
Intr 
Init 

>2264314 

len = 

Term 
Init 

>2264314 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2264314 

len = 
Sngl 
>2264314 

len = 
Sngl 
>2264314 

len = 
>2264314 

len = 



51723 52053 
52969 53262 

/156373 

1597 nex = 

56197 55946 

56442 56319 

57210 56988 

57542 57464 

78635 

1886 nex = 

10067 9103 

10250 10148 

10433 10340 

10988 10835 

/115644 

1259 nex = 

26540 26126 
27384 26837 

738996 

2313 nex = 



27833 
28049 
28349 
28813 
29046 
29175 
29838 



27526 
27984 
28278 
28492 
28886 
29131 
29580 



732785 
1499 nex =
41738 42167 
720245 
1429 nex = 
41738 42147 
75592 
1450 nex = 

713819 
1390 nex = 
Sngl
>2264314 
len =

Sngl 
>2264314 

10 

len = 
Sngl 

>2264314
len = 
Sngl 

20 

>2264314 
len = 
Sngl
>2264314 
len = 

30 

35 

>2264314 

40 

>2264314 
len = 
>2264314
len = 

50 
55 

>2264314 
len = 

60 



41738 42167 
/29726 
673 nex = 
46055 46727 
/41900 
567 nex = 
46131 46697 
/2462 
570 nex = 
46131 46700 
/16750 
585 nex = 
46131 46715 
/18232 
1571 nex = 



/9012 
1870 nex = 

77365 
1776 nex = 

/33059
2811 nex = 



727647 
1370 nex = 



Term 48315 47879 

Intr 48456 48413 

Intr 48598 48541 

Intr 48919 48826 

Init 49449 49182 



Term 61633 61320 

Intr 61973 61823 

Intr 62227 62054 

Intr 62409 62320 

Intr 62646 62576 

Intr 63811 62772 

Init 64130 63836 



936 

+ 0 



+ 0 



+ 0 



+ 0 



+ 0 



5 

0 
0 
0 
0 
0 



0 



0 



0 
0 
0 
0 
0 
0 
0 
Init
Intr 
Term 

>2264315 

len = 

Term 
Intr 
Intr 
Init 
>2264315



72212 72591 
72849 73086 
73196 73581 



2270 nex = 

26015 25438 
26141 26094 
27175 26240 
27707 27384 
729462 



len = 

Init 
Term 

>2264315 

len = 

Sngl 

>2264315 

len = 

Sngl 

>2264315 

len = 

Sngl 

>2264315 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2264316 

len = 



1139 nex = 

45117 45873 
45961 46255 

/14965 

430 nex = 

47036 46610 

/114307 

464 nex = 

47105 46642 

/3363 

636 nex = 

47111 46476 

/41666 

2157 nex = 



59476 
59800 
60015 
60160 
60278 
60433 
60582 
60709 
60876 
61055 
61205 
61348 



59703 
59887 
60074 
60192 
60355 
60476 
60622 
60791 
60967 
61124 
61246 
61632 



/31759 
1810 nex 



Term 40887 40024 
Intr 41245 40976
Init

>2264316 

len = 

Term 
Intr 
Intr 
Init 

>2264316 

len = 

Init 
Term 



41826 41375 

/4716 

1150 nex = 

48078 47771 

48347 48169 

48549 48448 

48918 48760 

735357 

3430 nex = 

4937 5508 
7116 8360 



>2264316 

Term 
Intr 
Intr 
Init 

>2264316 

len = 

Term 
Intr 
Intr 
Init 

>2264316 

len = 



1121 nex = 

50134 49841 

50452 50271 

50665 50567 

50961 50832 

725839 

1733 nex = 

52037 51717 

52799 52621 

52994 52893 

53449 53248 

75103 



Term 
Init 

>2264316 

len = 

Init 
Intr 
Term 

>2264316 

len = 

Init 
Intr 
Intr 
Intr 
Term 



56108 55749 
56314 56188 



70502 70609 

70687 70765 

71265 71619 

728686 

1761 nex = 

73159 73478 

73823 73864 

74151 74238 

74355 74436 

74532 74919 
1316 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Term 



75294 
75493 
75623 
75977 
76215 
76389 



75411 
75533 
75723 
76121 
76304 
76609 



>2264316 
len = 



940 



nex 



Init 
Intr 
Intr 
Term 



75623 75723 

75977 76121 

76215 76304 

76389 76430 



/27304 



1450 



nex 



Init 
Intr 
Intr 
Term 



10536 10865 

11094 11307 

11430 11575 

11678 11977 

/41386 



2230 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



18624 
19320 
19544 
19786 
19964 
20166 
20364 



18806 
19433 
19688 
19863 
20076 
20269 
20848 



1116 nex 



Term 
Intr 
Intr 
Intr 
Init 



39626 
39837 
39994 
40263 
40495 



39380 
39741 
39932 
40110 
40353 



2230 



Init 
Intr 
Intr 
Intr 



43041 
43615 
43820 
44029 



43121 
43732 
43927 
44153 
Intr
Intr 
Intr 
Term 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2264318 

len = 

Term 
Intr 
Intr 
Init 

>2264318 

len = 

Sngl 

>2264318 

len = 

Sngl 

>2264318 

len = 

Init 
Intr 
Intr 
Term 

>2264319 

len = 

Term 
Intr 
Init 



44256 44520 

44612 44680 

44773 44934 

45031 45269 

/3797 

3010 nex = 



14549 
14698 
14911 
15084 
15230 
15408 
15837 
16050 
16304 
16522 
17210 



14209 
14642 
14777 
15004 
15162 
15334 
15757 
15932 
16139 
16393 
16609 



/33231 

1510 nex = 

19006 18683 

19387 19102 

19635 19485 

20191 19835 

/42276

681 nex = 

24794 24114 

/26752

754 nex = 

6372 6627 

/25855

2410 nex = 

74093 74435 

74770 74907 

75288 75359 

75730 76502 

/37985



29497 28961 
29867 29820 
30001 29945 



>2264320



/36697
3070



nex =






Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



nil A 

18511 
78827 
79001 
79159 
79302 
79602 
79848 
80000 
80127 
80327 
80513 



77917 
78735 
78886 
79047 
79212 
79479 
79754 
79913 
80047 
80233 
80405 
80843 






>2264321 
len = 
Sngl 
>2264321 

len =



2639 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



44138 
44545 
44723 
45000 
45164 
45369 
45565 
45725 
46523 



43885 
44459 
44638 
44910 
45079 
45261 
45519 
45656 
45984 



1845 nex =



Term 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Init 

>2264321



62897 
63027 
63329 
63461 
63720 
63853 
64024 
64452 

/226 

1214 nex =



62608 
62981 
63108 
63409 
63555 
63812 
63940 
64119 



Init 
Intr 
Term 



64562 64824 
65227 65327 
65506 65775 



len =



1179 nex = 



1 



Reference No. 2750-942P 



Sngl 

>2264367 

len = 

Sngl 

>2264367 

len = 

Init 
Intr 
Intr 
Term 

>2264367 
len = 
Sngl 

>2264367 

len = 

Term 
Intr 
Intr 
Init 

>2275194 

len = 

>2275194 

len = 

Sngl 

>2275194 

len = 

Sngl 

>2275194 

len = 

>2275194 

len = 

Sngl 

>2275194 



64502 65780 

/13226 

760 nex = 

17702 16945 

/6280

1721 nex = 

79635 80401 

80649 80739 

80875 81047 

81136 81355 

/14253 

394 nex = 

79694 80087 

/2093

1697 nex = 

81924 81450 

82092 82014 

82411 82172 

83146 82545 

/35109 

1541 nex = 

/20378 

540 nex = 

46427 45888 

/6324

564 nex = 

81129 81692 

/95662

550 nex =

/21715 

339 nex = 

81340 81678 

/34414 
len



2177 



Init 
Intr 
Intr
Intr 
Intr 
Term 



1529 
1807 
2195 
2406 
2616 
2789 



1687 
1877 
2314 
2524 
2697 
3076 



>2281081 
len = 



2064 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



1529 
1807 
2195 
2406 
2616 
2789 



1687 
1877 
2314 
2524 
2697 
3017 



>2281081 



1179 nex =



Init 
Intr 
Intr 
Term 



17994 
18570 
18757 
18973 



18277 
18617 
18836 
19172 



Term 
Init 



20892 20042 
21655 20980 



Sngl 
>2281081 



3043 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



41405 
41989 
42186 
42347 
42881 
43151 
43288 
43475 
43663 
44186 



41802 
42098 
42243 
42610 
43018 
43202 
43367 
43534 
43743 
44447 



>2281081 /117763
len = 158 nex = 1
Sngl 44280 44447 + 0

>2281081 /37969
len = 1630 nex = 2
Init 45636 46252 + 0
Term 46437 47256 + 0

>2281081 /97249
len = 1570 nex = 3
Init 75474 75567 + 0
Intr 75664 75773 + 0
Term 76110 76381 + 0

>2288979 /30737
len = 1957 nex = 6
Term 23749 23549 - 0
Intr 24382 24215 - 0
Intr 24583 24465 - 0
Intr 24734 24673 - 0
Intr 24906 24830 - 0
Init 25505 25278 - 0

>2288979 /42038
len = 2417 nex = 6
Term 26123 25700 - 0
Intr 26352 26213 - 0
Intr 26728 26523 - 0
Intr 27113 27007 - 0
Intr 27509 27330 - 0
Init 28116 27832 - 0

>2288979 /5460
len = 1369 nex = 2
Term 61213 60939 - 0
Init 61831 61648 - 0

>2288979 /31535
len = 971 nex = 1
Sngl 6953 5983 - 0

>2288979 /15927
len = 582 nex = 2
Init 83467 83577 + 0
Term 83732 84048 + 0
>2288979

len = 

Init 
Term 

>2288979 

len = 

Init 
Intr 
Term 

>2288979 

len = 

Init 
Intr 
Term 

>2288979 

len = 

Init 
Term 

>2288979 

len = 

Init 
Term 

>2288979 
len = 
Sngl 

>2288979 

len = 

Init 
Term 

>2288979 

len = 



Init 
Term 



/14769 

598 nex = 

84968 85307 
85318 85559 

/22360 

621 nex = 

84968 85076 
85239 85307 
85318 85582 

/8155 

637 nex =

84968 85076 
85239 85307 
85318 85598 

/91704 

685 nex =

85879 86099 
86237 86563 

/14241 

594 nex =

85971 86099 
86237 86564 

/27364 
593 nex =
87277 87869 

/16079 

558 nex = 

88140 88265 
88343 88697 

/85 

670 nex = 

88140 88265 
88343 88809 
510 nex



Term 
Init 

>2326340 

len = 

Init 
Intr 
Term 

>2335089 

len = 

Init 
Term 

>2335089 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2337888 

len = 

Sngl 

>2337888 

len = 

Sngl 

>2337888

len = 

Term 
Intr 
Intr 
Init 

>2337888 

len = 



89987 89856 
90375 90255 

/17730 

938 nex = 

12848 12929 
13222 13294 
13456 13785 

/17415 

1711 nex = 

18997 19833 
20359 20707 

/41462 

2561 nex = 

77553 77859 

78200 78282 

78527 78615 

78796 78869 

78950 79000 

79347 79408 

79492 80113 

/30632 

597 nex = 

45399 44803 

/33132 

1427 nex = 

56372 54946 

/25271 

1190 nex = 

81979 81460 

82251 82069 

82443 82348 

82649 82529 

/36364

2473 nex = 



Term 81979 81474 
Intr 82251 82069 
Intr 82443 82348
Intr
Intr 
Intr 
Intr 

Intr
Intr
Init

>2337888

len =
Sngl

>2337888
len =
Init
Term

>2341023
len =

Term
Intr
Intr
Intr
Intr
Intr
Intr
Init

>2341023
len =
Term
Intr
Init

>2341023

len =

Term
Intr
Init

>2341023

len =

Term
Init

>2341023

len =



947 



82588 82529 - 0 

82726 82673 - 0 

82906 82829 - 0 

83042 82989 - 0 

83230 83147 - 0 

83655 83627 - 0 

83946 83768 - 0 

/48 

139 nex = 1 

84105 83967 - 0 

/39291 

1330 nex = 2 

9724 10277 + 0 

10380 11048 + 0 

/20848 

2394 nex = 8 

105381 105134 - 0 

105770 105531 - 0 

106011 105948 - 0 

106356 106242 - 0 

106669 106531 - 0 

106971 106841 - 0 

107209 107080 - 0 

107527 107476 - 0 

/4513 

1287 nex = 3 

16071 15822 - 0 

16960 16676 - 0 

17108 17082 - 0 

/26558

1150 nex = 3 

23857 23331 - 0 

24045 23945 - 0 

24472 24392 - 0 

/23398

2892 nex = 2 

36567 36137 - 0 

39028 38927 - 0 



/40467
2202 nex = 7
Init
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2341023 

len = 

Init 
Intr 
Intr 
Term 

>2341023 

len = 

Init 
Term 

>2341023 

len = 

Init 
Intr 
Term 

>2341023 
len = 
Sngl 

>2341023 

len = 

Init 
Intr 
Term 

>2341023 

len = 

Sngl 

>2341023 

len = 

Sngl 



41815 41979 

42299 42457 

42564 42739 

42897 43174 

43264 43399 

43492 43603 

43692 44016 

/19832 

2656 nex = 

45198 45615 

45720 45944 

46040 46752 

46898 47342 

/91880 

1118 nex = 

46306 46752 
46898 47423 

/8374

805 nex = 

84788 85031 
85113 85256 
85340 85592 

/9471 

649 nex =

85423 85236 

/30909 

1020 nex = 

90351 90483 
90571 90628 
91104 91353 

/28606 

730 nex = 

91839 92568 

/125151 

310 nex = 

96904 96600 



>2341023



/33613 
Term
Intr 
Intr 
Intr 
Init 



94901 94658 

95464 95403 

95744 95606 

96270 96059 

96946 96584 



>2342673 
len = 



/21644 



Init 
Term 



Sngl 
>2342673 



15499 14469 
/13218 



Sngl 
>2342673 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



1410 nex 
/15745 



2693 

72951 
73173 
73327 
73473 
73651 
73809 
73936 
74109 
74283 
74471 
74618 
74789 
74956 
75176 
75290 



nex = 

72598 
73059 
73268 
73420 
73592 
73747 
73893 
74025 
74203 
74379 
74554 
74714 
74891 
75051 
75255 



/20814 
2669 nex



Term 
Intr 
Intr 
Intr 



87698 87414 

87906 87792 

88057 87998 

88219 88166 
Intr
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



88375 
88529 
88664 
88853 
89044 
89241 
89408 
89583 
89751 
89916 



88316 
88467 
88621 
88769 
88964 
89149 
89344 
89508 
89686 
89851 



len 



3206 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



87698 
87906 
88057 
88219 
88375 
88529 
88664 
88853 
89044 
89241 
89408 
89583 
89751 
89916 
90281 
90727 



87522 
87792 
87998 
88166 
88316 
88467 
88621 
88769 
88964 
89149 
89344 
89508 
89686 
89851 
90192 
90584 



>2342673 
len =



Init 
Term 



95406 95717 
95822 96232 



>2342717
len = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



28916 
29102 
29276 
29479 
29760 
29937 
30204 
30570 
30730 
31414 
31587 
32170 
32332 
32516 
32772 
33012 



28495 
29002 
29211 
29365 
29654 
29848 
30094 
30505 
30665 
31265 
31513 
32079 
32267 
32417 
32611 
32912 
>2342717

len = 

Term 
Intr 
Intr 
Init 

>2342717 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2351061 

len = 

Term 
Intr 
Intr 
Init 

>2351061 

len = 



Init 
Term 

>2351061 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>2351061 
len = 
Sngl 

>2351061 
len = 

>2351062 
len = 



/23892 

1550 nex = 

33902 33442 

34398 34340 

34564 34485 

34991 34651 

/25519 

2805 nex = 

38674 38181 

38927 38769 

39218 39037 

40474 40303 

40985 40560 

/36048 

2257 nex = 

36654 36150 

37353 37320 

37883 37644 

38406 38255 

/16286 

1302 nex = 

60023 60178 
60434 60780 

/25119 

2152 nex = 

72312 72460 

72978 73443 

73577 73670 

73763 73893 

74106 74463 

/7022 
1348 nex = 
74769 74513 

/37512 
1737 nex =

/1575 
1492 nex = 



Init 



11143 11366 
Term 11952 12270
>2351062 /38092
len = 2470 nex =



Term 
Intr 
Init 

>2351062 

len = 

Init 
Intr 
Term 

>2351062

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



27085 26904 
28828 27521 
29365 29247 

/17241 

1404 nex = 

29965 30040 
30233 30463 
30712 30955 

/31041 

2710 nex = 



50901 
51563 
51779 
52010 
52264 
52687 
52881 
53072 



51179 
51664 
51832 
52102 
52356 
52791 
52979 
53603 





len = 1277 nex = 3
Init 71481 71998 + 0
Intr 72070 72397 + 0
Term 72483 72757 + 0

>2351063 /114691
len = 1789 nex = 8
Term 20785 20575 - 0
Intr 20954 20889 - 0
Intr 21132 21047 - 0
Intr 21269 21235 - 0
Intr 21455 21369 - 0
Intr 21616 21539 - 0
Intr 21741 21701 - 0
Init 22363 22239 - 0

>2351063 /36626
len = 1476 nex = 6
Term 21132 21053 - 0
Intr 21269 21235 - 0
Intr 21455 21369 - 0
Intr 21616 21539 - 0






Intr 21741 21701 
Init 22528 22239 



len = 1211 nex = 

Init 28196 28319 

Intr 28394 28464 

Intr 28552 28573 

Term 28658 29015 



>2351063 

len = 

Init 
Intr 
Intr 
Term 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2351063 

len = 

Sngl 

>2351063 

len = 

Sngl 

>2351063

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



/103246 

1195 nex = 

28196 28319 

28394 28464 

28552 28573 

28658 29015 

/36058 



2835 

55242 
55634 
55825 
56186 
56488 
56694 
56864 
57238 
57635 
57871 



55559 
55699 
55890 
56264 
56608 
56789 
56976 
57354 
57735 
58076 



/95281 
430 nex = 
58996 59416 
/108981 
314 nex = 
62819 63132 
/19716 



2088 

66456 
66816 
67192 
67350 
67560 
67709 
67857 
68262 



nex = 

66175 
66527 
66895 
67280 
67444 
67635 
67796 
68028 
len = 3071 nex = 5
Term 81242 80797 - 0
Intr 81474 81378 - 0
Intr 81610 81555 - 0
Intr 81979 81686 - 0
Init 82808 82071 - 0

>2351064 /10154
len = 1092 nex = 5
Init 30525 30610 + 0
Intr 30871 30941 + 0
Intr 31032 31188 + 0
Intr 31364 31450 + 0
Term 31536 31617 + 0

>2351064 /23922
len = 2156 nex = 9
Init 30531 30610 + 0
Intr 30871 30941 + 0
Intr 31032 31188 + 0
Intr 31364 31450 + 0
Intr 31536 31687 + 0
Intr 31802 31882 + 0
Intr 31983 32091 + 0
Intr 32233 32359 + 0
Term 32454 32686 + 0

>2351064 /41054
len = 500 nex = 2
Init 32229 32359 + 0
Term 32454 32728 + 0

>2351064 /37122
len = 2271 nex = 5
Term 52016 51678 - 0
Intr 52304 52104 - 0
Intr 52616 52417 - 0
Intr 52811 52698 - 0
Init 53187 53050 - 0

>2351065 /8508
len = 286 nex = 1
Sngl 1156 871 - 0

>2351065 /29363






len = 8125 nex =
Term 5274 4953
Intr 12650 5804
Init 13070 12743

>2351065 /3542
len = 1606 nex =
Term 12650 12382
Intr 13557 12743
Init 13987 13679

>2351065 /117588
len = 1433 nex =
Init 26825 26985
Intr 27076 27149
Term 27414 28257

>2351065 /15229
len = 1952 nex =
Term 28953 28676
Intr 29086 29035
Intr 29404 29169
Intr 29662 29605
Intr 29821 29753
Intr 30022 29914
Intr 30232 30165
Intr 30434 30315
Init 30627 30561

>2351065 /41047
len = 1956 nex =
Term 28953 28675
Intr 29086 29035
Intr 29404 29169
Intr 29662 29605
Intr 29821 29753
Intr 30022 29914
Intr 30232 30165
Intr 30434 30315
Init 30630 30561

>2351065 /105944
len = 254 nex =
Sngl 38996 38743

>2351065 /6823
len = 436 nex =






Sngl 420 855

>2351065 /15640
len = 2139 nex =
Term 54303 53997
Intr 54528 54415
Intr 54773 54648
Intr 55027 54948
Intr 55198 55117
Intr 55390 55316
Init 56135 55791

>2351065 /633
len = 529 nex =
Sngl 56522 57050

>2351065 /104017
len = 1017 nex =
Term 62259 61832
Init 62503 62277

>2351066 /92216
len = 1063 nex =
Sngl 2252 1951

>2351066 /18332
len = 372 nex =
Sngl 51067 50696

>2351066 /19255
len = 1001 nex =
Init 6275 6505
Term 6677 6809

>2351066 /93148
len = 557 nex =
Sngl 64963 64407

>2351066 /9184
len = 1493 nex =
Init 65437 65484
Intr 65563 65622
Term 66328 66800






len =

Term
Intr
Intr
Init

>2351066
len =
Term
Intr
Intr
Intr
Init

>2351067
len =
Sngl

>2351067
len =
Sngl
>2351067
len =

Init
Term

>2351067

len =

Term
Intr
Intr
Init

>2351067

len =

Term
Intr
Intr
Init
>2351067
len =



/94924

772 nex =

66989 66757 

67176 67069 

67314 67274 

67528 67424 

/117503 

1270 nex = 

82968 82813 

83338 83123 

83553 83453 

83928 83699 

84064 83998 

/24137 

592 nex = 

23773 23182 

/102435 

1553 nex = 

31407 31589 

/42506 

913 nex = 

3624 3761 
4109 4536 

/37503 

1398 nex = 

39519 39286 

39736 39638 

40371 40283 

40683 40599 

/23800 

1450 nex =

39519 39294 

39736 39638 

40371 40283 

40735 40599 

/12458 

192 nex = 






Sngl 

>2351068 

len = 

Init 
Intr 
Intr 
Term 



43705 43896 
/108814 
755 nex = 



14299 
14508 
14771 
14906 



14392 
14644 
14817 
15053 



2311 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2351068 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2351068 

len = 

Sngl 

>2351068 

len = 

Sngl 

>2351068

len = 

Sngl 

>2351068 



14309 
14508 
14771 
14906 
15511 
15693 
15855 
16102 
16357 



2272 

14347 
14508 
14771 
14906 
15511 
15693 
15855 
16102 
16357 



14392 
14644 
14817 
15231 
15593 
15768 
16012 
16263 
16619 



nex = 

14392 
14644 
14817 
15231 
15593 
15768 
16012 
16263 
16618 



/777 
540 nex = 
22901 23440 
/2304 
550 nex = 
22901 23442 
/15211 
560 nex = 
22904 23463 
/27372



len = 1870 nex =



3 






Init 
Intr 
Term 



42505 42957 
43205 43414 
43963 44371 



>2351068 
len = 



/5335 
731 nex 



Init 
Intr 
Term 



5218 5304 
5320 5477 
5551 5931 



>2351068
len = 
Sngl 

>2351068 
len = 



2200 



nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



65723 
66035 
66298 
66544 
66874 
67153 
67680 



65950 
66198 
66349 
66771 
67063 
67418 
67922 



>2351068 
len = 
Sngl 

>2351069 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2351069 

len = 



2002 nex =



26231 
26762 
26960 
27209 
27450 
27686 
27886 



3480 



26670 
26870 
27122 
27357 
27601 
27800 
28232 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



42775 
43235 
43517 
43791 
44014 
44277 
44852 



42864 
43369 
43633 
43942 
44098 
44371 
45017 






Intr 
Term 



45150 
45434 



45345 
45819 



Init 
Intr 
Term 



62856 62885 
62964 63042 
63127 63557 



1618 nex 



Term 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Init 

>2351069



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2351070 

len = 

Sngl 

>2351070 

len = 

Sngl 

>2351070 

len = 



67305 
67508 
67723 
67896 
68098 
68261 
68427 
68562 



66948 
67411 
67598 
67813 
67982 
68178 
68380 
68508 



3163 nex 



67262 
67508 
67723 
67896 
68098 
68261 
68427 
68562 
68759 
68928 
69102 
69415 
70098 



67061 
67411 
67598 
67813 
67982 
68178 
68380 
68508 
68704 
68844 
69029 
69349 
70008 



/97197 
697 nex =
23957 23261 
/6363 
560 nex = 
34956 34397 
/26053 
817 nex = 



Sngl 46123 46936 




0 






>2351071 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2351071

len = 

Term 
Intr 
Init 

>2351071 

len = 

Term 
Intr 
Init 

>2351071 

len = 

Term 
Intr 
Init 

>2351072 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2351072 
len = 
Sngl 

>2351073 
len = 



2313 

46885 
47174 
47356 
47556 
47720 
47910 
48093 
48436 
48898 



46586 
47088 
47291 
47467 
47640 
47833 
48003 
48295 
48628 



70730 70227 
71606 71158 
72412 72145 

/17360 

1402 nex = 

78193 77927 
78535 78274 
79311 79168 

/26743 

1466 nex =

78193 77927 
78535 78274 
79392 79168 

/29659

2508 nex = 



22869 
23128 
23667 
23978 
24786 



22279 
23019 
23238 
23838 
24671 



/207148 
797 nex = 
50991 50195 
/98326 
676 nex = 



Term 19588 19334 
Intr 19757 19681








Init 19996 19838 - 0

>2351073 /100141
len = 1717 nex =
Term 19588 19293 - 0
Intr 19757 19681 - 0
Intr 20220 19838 - 0
Intr 20633 20533 - 0
Init 21009 20902 - 0

>2351073 /115914
len = 116 nex =
Sngl 26710 26595 - 0

>2351073 /95599
len = 749 nex =
Term 26967 26608 - 0
Intr 27178 27047 - 0
Init 27356 27258 - 0

>2351073 /35552
len = 1828 nex = 6
Term 26967 26653 - 0
Intr 27178 27047 - 0
Intr 27399 27258 - 0
Intr 27742 27550 - 0
Intr 28087 27842 - 0
Init 28480 28170 - 0

>2351073 /118777
len = 1030 nex =
Sngl 31871 32900 + 0

>2358139 /20380
len = 876 nex =
Init 15794 15936 + 0
Intr 16035 16176 + 0
Term 16428 16669 + 0

>2358139 /29808
len = 1270 nex = 2
Term 64249 63873 - 0
Init 65100 64760 - 0

>2358139 /108558






Init 
Intr 
Term 

>2358139 

Init 
Intr 
Term 

>2392762 

len = 

Term 
Init 

>2392762 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2392762 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2392762 

len = 

Init 
Intr 
Term 

>2435510 

len = 



65271 65413 
65781 65860 
66116 66339 



71725 71848 
72291 72590 
72701 73208 



30586 29909 
31167 30868 



1796 

60877 
61051 
61293 
61514 
61620 
61952 
62107 
62416 



60621 
60973 
61140 
61420 
61585 
61727 
62037 
62342 



1729 nex 



60877 
61051 
61293 
61514 
61620 
61952 
62107 
62416 



60688 
60973 
61140 
61420 
61585 
61727 
62037 
62342 



/41162 

951 nex = 

68249 68350 
68449 68513 
68901 69199 

/32833

1450 nex = 






Term 



41015 40654 



0 






Intr 
Intr 
Intr 
Init 



41265 41098 

41451 41368 

41718 41540 

42097 41892 



>2435510 
len = 



/1011
1120 nex =



Term 
Intr 
Init 



51801 51490 
52028 51949 
52609 52122 



>2435510 
len = 



/19362 
1041 nex = 



Init 
Intr 
Term 



61031 61254 
61359 61535 
61610 62071 



>2435510 
len = 



/142314 
919 nex =



Init 
Intr 
Term 



61151 61254 
61359 61535 
61610 62069 



>2435510 



/33456
2142 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Init 



4364 
4676 
5214 
5423 
5600 
6192 



4051 
4612 
5151 
5314 
5513 
5794 



>2435510 
len = 



2039 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



76018 
76377 
76648 
76793 
77335 
77587 
77749 
77912 



76119 
76574 
76707 
77235 
77501 
77660 
77808 
78053 



>2443899 
len = 



/22008 
1489 nex 



Term 102074 101797 
Init 103282 102296 







965 

>2443899 /1734
len = 888 nex = 2
Term 14747 14318 - 0
Init 15205 14965 - 0

>2459406 /42992
len = 2396 nex = 10
Term 117911 117825 - 0
Intr 118071 117986 - 0
Intr 118340 118166 - 0
Intr 118518 118458 - 0
Intr 118661 118595 - 0
Intr 118838 118754 - 0
Intr 119077 118920 - 0
Intr 119310 119166 - 0
Intr 119486 119427 - 0
Init 119855 119575 - 0

>2459406 /11254
len = 2035 nex = 6
Init 128392 128598 + 0
Intr 128894 129063 + 0
Intr 129142 129327 + 0
Intr 129412 129577 + 0
Intr 129681 129870 + 0
Term 130089 130426 + 0

>2459406 /92741
len = 538 nex = 1
Sngl 141230 140693 - 0

>2459406 /13741
len = 1713 nex = 4
Term 18475 18146 - 0
Intr 18628 18567 - 0
Intr 19123 18713 - 0
Init 19858 19394 - 0

>2459406 /25272
len = 1750 nex = 4
Init 2679 2985 + 0
Intr 3377 3419 + 0
Intr 3511 3571 + 0
Term 3697 4419 + 0

>2459406 /35273
len = 2218 nex = 3






Term 
Intr
Init 

>2459406 

len = 

Term 
Intr 
Intr 
Init 

>2459406 

len = 

Sngl 

>2459406 

len = 

Sngl 

>2459406

len = 

Init 
Intr 
Intr 
Term 

>2459406 

len = 

Init 
Term 

>2459406 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 



26889 26777 
28208 27837 
28994 28459 

/28563

1150 nex = 

47656 47428 

47792 47751 

48158 47874 

48577 48488 

/119409 

468 nex = 

57470 57023 

/116034 

337 nex =

61222 61558 

/8717 

2113 nex = 

66546 66940 

67084 67181 

67274 67339 

68443 68658 

/31633 

945 nex = 

77435 77674 
78004 78379 

/19302 

2115 nex = 

80490 80306 

80717 80586 

80949 80814 

81174 81044 

81479 81424 

82420 82270 



>2459406 
len = 



/37919 
2274 nex 



Term 80490 80262 
Intr 80717 80586 
Intr 80949 80814 






Intr 
Intr
Init

>2459406 

len = 

Sngl 

>2477521 

len = 

Sngl 

>2477521 

len = 



81174 81044 
81479 81424 
82535 82270 

/18894 

235 nex = 

85070 85304 

/15308 

1434 nex = 

11192 12625 

/27205 

760 nex =



Term 
Intr 
Init 



22663 22447 
22864 22743 
23206 22955 



>2477521 
len = 



/40049 
3210 nex =



Init 
Intr 
Intr 
Intr 
Term 



52491 52536 

52618 52732 

52824 52891 

52986 53708 

53792 54336 



>2477521 

len = 

Init 
Intr 
Intr 
Term 

>2477521 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 



/3549
1750 nex =



59783 
60329 
60773 
60979 



60056 
60677 
60914 
61527 



/12293
1796 nex =



71123 
71380 
71502 
71702 
72024 
72431 



70636 
71205 
71478 
71620 
71951 
72108 



>2477521 
len = 



/98850
4463 nex =



Init 74583 74814 
Intr 77407 77441








Intr 77533 77614 + 0
Intr 77696 77795 + 0
Intr 77904 77945 + 0
Intr 78281 78322 + 0
Term 78695 79045 + 0

>2477521 /92459
len = 4460 nex = 7
Init 74588 74814 + 0
Intr 77285 77342 + 0
Intr 77553 77614 + 0
Intr 77696 77795 + 0
Intr 77904 77945 + 0
Intr 78281 78322 + 0
Term 78695 79047 + 0

/5076
len = 730 nex =
Term 79591 79372 - 0
Intr 79924 79697 - 0
Init 80096 80042 - 0

>2477521 /4033
len = 1930 nex = 7
Init 94403 94493 + 0
Intr 94625 94761 + 0
Intr 94865 94911 + 0
Intr 94999 95483 + 0
Intr 95570 95727 + 0
Intr 95814 95975 + 0
Term 96051 96327 + 0

>2494106 /36412
len = 1375 nex =
Term 99606 98923 - 0
Intr 100124 99692 - 0
Init 100297 100214 - 0

>2494106 /11408
len = 644 nex = 1
Sngl 109531 110174 + 0



910 nex = 
112974 112773 



>2494106 




/37020 






len = 757 nex =
Term 122980 122712
Intr 123133 123078
Intr 123278 123220
Init 123468 123370

>2494106 /29872
len = 861 nex =
Term 122980 122712
Init 123133 123078

>2494106 /34434
len = 866 nex =
Term 122980 122714
Intr 123133 123078
Intr 123278 123220
Init 123577 123370

>2494106 /34374
len = 359 nex =
Term 123278 123219
Init 123577 123370

>2494106 /5465
len = 2050 nex =
Init 132597 132734
Intr 133129 133207
Intr 133336 133389
Intr 133680 133793
Intr 134040 134107
Intr 134190 134301
Term 134381 134640

>2494106 /1520
len = 1810 nex =
Init 132677 133207
Intr 133336 133389
Intr 133680 133793
Intr 134040 134107
Intr 134190 134301
Term 134381 134477

>2494106 /2681
len = 910 nex =
Sngl 143514 143911

>2494106 /33770






len = 1302 nex =
Term 158712 158351
Intr 159059 158976
Intr 159236 159156
Init 159509 159332

>2494106 /27457
len = 1462 nex =
Term 40898 40547
Intr 41137 41003
Intr 41443 41231
Init 42008 41526

>2494106 /25255
len = 1719 nex =
Init 54004 54063
Intr 54151 54486
Term 54639 54877

>2494106 /14939
len = 610 nex =
Init 54277 54486
Term 54639 54879

>2494106 /32130
len = 3130 nex =
Term 56042 55686
Intr 56181 56114
Intr 56328 56265
Intr 56502 56421
Intr 56676 56618
Intr 56984 56925
Intr 57266 57104
Intr 57498 57374
Intr 57857 57795
Intr 58060 58001
Intr 58325 58140
Init 58811 58689

>2494106 /6667
len = 1554 nex =
Sngl 60644 59091

>2494106 /25894
len = 1630 nex =
Term 64139 63599






Intr 
Intr 
Init

>2494110 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2494110 

len = 

Init 
Term 

>2494110 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>2494110 

len = 

Term 
Intr 
Init 

>2494110 

len = 

Term 
Intr 
Init 

>2494110 
len = 
Sngl 

>2494110 
len = 



64439 64381 
64965 64855 
65226 65138 

/23300 

2036 nex = 

17775 17469 

18041 17877 

18302 18159 

18618 18423 

19504 19053 

/8559 

1302 nex = 

25200 25402 
26210 26501 

/37952

4214 nex = 

25200 25402 

26210 26290 

27617 28259 

28358 28461 

28571 28709 

28803 29413 

/21100 

812 nex = 

30699 30410 
30921 30796 
31221 30993 

/34753

807 nex = 

30699 30415 
30921 30796 
31221 30993 

/110726

493 nex = 
32194 32672 

/2265

494 nex = 



Sngl 38819 39312 




0 








>2494110 /13232
len = 1220 nex = 1
Sngl 40544 39752 - 0

>2494110 /31923
len = 1284 nex = 3
Init 41985 42310 + 0
Intr 42859 42930 + 0
Term 43017 43268 + 0

>2494110 /100984
len = 1340 NO match - No prediction

>2494110 /27110
len = 108 nex = 1
Sngl 74373 74480 + 0

>2494110 /40608
len = 1703 nex = 5
Term 91321 90966 - 0
Intr 91466 91405 - 0
Intr 91657 91540 - 0
Intr 92025 91739 - 0
Init 92668 92298 - 0

>2494110 /2935
len = 1613 nex =
Init 97175 97627 + 0
Intr 97725 97897 + 0
Intr 97974 98088 + 0
Intr 98324 98478 + 0
Term 98578 98787 + 0

>2505864 /35333
len = 1426 nex = 4
Init 20951 21020 + 0
Intr 21255 21415 + 0
Intr 21681 21869 + 0
Term 22136 22367 + 0

>2505864 /4328
len = 1374 nex = 4
Init 21013 21070 + 0
Intr 21255 21415 + 0
Intr 21681 21869 + 0
Term 22136 22386 + 0






>2505873 

len = 

Term 
Init 

>2505873 

len = 

Sngl 

>2505873 

len = 

Sngl 

>2529657 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2529657 

Term 
Intr 
Intr 
Intr 
Init 

>2529657 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2529657 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 



14088 13696 
15295 14419 



19922 20055 



27483 27718 

/32457

1232 nex = 

10325 10206 

10512 10408 

10777 10703 

11135 mil 

11437 11243 



/26123 

1422 nex = 

10325 10041 

10512 10408 

10777 10703 

11135 mil 

11462 11380 



/20647 



1390 



nex =



12109 11630 

12283 12185 

12499 12362 

12722 12592 

13015 12840 

/28691

2057 nex = 

12109 11913 

12283 12185 

12499 12362 

12722 12592 

12986 12840 

13647 13615 






Init 13969 13735
>2529657 /33373
len = 2492 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



12109 
12283 
12499 
12722 
12986 
13647 
13831 
14128 



11637 
12185 
12362 
12592 
12840 
13615 
13735 
13974 



1054 nex 



Term
Intr
Intr
Init

>2529657
len =
Term
Intr
Intr
Intr
Init



17370 17243 

17555 17463 

17935 17637 

18296 18094 

76394 

1870 nex =

17370 16988 

17555 17463 

17935 17637 

18295 18094 

18459 18415 



>2529657
len = 



725729 



Sngl 
>2529657 



3834 4635 
737870 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



46706 
46947 
47087 
47280 
47466 
47623 
47773 
47950 
48158 
48324 
48463 
48638 
49052 
49302 
49575 
49795 



46424 
46882 
47058 
47182 
47371 
47573 
47707 
47856 
48077 
48275 
48413 
48540 
48969 
49192 
49426 
49678 






Init 50050 49884

>2529657 /32039

len = 670 nex =

Sngl 63987 64654

>2529657 /9499

len = 654 nex =

Term 65131 64658

Init 65297 65222

>2529657 /38461

len = 2350 nex =

Term 65131 64823

Intr 65346 65222

Intr 65588 65432

Intr 65777 65686

Intr 65890 65863

Intr 66093 65976

Intr 66394 66339

Intr 66604 66507

Intr 66777 66693

Init 67165 66986

>2529657 /13774

len = 2397 nex =

Term 65131 64823

Intr 65346 65222

Intr 65588 65432

Intr 65777 65686

Intr 65890 65863

Intr 66093 65976

Intr 66394 66339

Intr 66604 66507

Intr 66777 66693

Init 67219 66986

>2529657 /34914

len = 717 nex =

Sngl 75255 74539

>2529657 /37980

len = 1352 nex =

Sngl 75893 74542

>2564044 /156017

len = 401 nex =








Sngl 


12975 12575 


- 


0 




>2564044 


/156773 


















len = 


350 nex = 


1 






Sngl 


12997 12648 


- 


0 




>2564044 


/31129 








len =


430 nex = 


1 






Sngl


13041 12616 


-


0 














>2564044 


/21629 








len = 


1610 nex = 


5 






Term 


36986 36739 


- 


0 




Intr 


37123 37068 




0 






37318 37272 




0 




Intr 


37669 37626 


-


0 




Init 


38348 38232 


-


0 














>2564044 


/22860 








len =


3400 nex =


11 








5043 5315 


+ 


0 




Intr 


5670 5734 


+ 


0 




Intr 


5871 5969 


+ 


0 




Intr 


6171 6303 


+ 


0 




Intr 


6748 6807 


+ 


0 




Intr 


6897 7019 


+ 


0 






7379 7450 


+ 


0 




Intr 


7562 7699 


+ 


0 




Intr 


7786 7941 


+ 


0 




Intr 


8028 8132 


+ 


0 




Term 


8282 8442 


+ 


0 




>2564045 


/108335 








len = 


1516 nex = 


2 
















Term 


653 118 




0 




Init 


1633 770 


- 


0 




>2564045 


/512 


















len = 


1435 nex = 


1 






Sngl 


40196 39668 




0 




>2564045 


/40250 








len = 


1210 nex = 


2 





Term 57008 56234 
Init 57441 57096






>2564045 /36090

len = 1219 nex =

Term 57008 56234

Init 57452 57096

>2564045 /33763

len = 1217 nex =

Sngl 5886 7102

>2564045 /23566

len = 1043 nex =

Init 9042 9192

Intr 9618 9763

Term 9851 10084

>2564046 /4272

len = 4185 nex =

Term 18249 17894

Intr 18506 18454

Intr 18683 18598

Intr 18985 18867

Intr 19502 19431

Intr 19881 19708

Intr 20444 20289

Intr 20917 20836

Intr 21276 21130

Intr 21654 21468

Init 22078 21842

>2564046 /13993

len = 1672 nex =

Init 27089 27339

Intr 27573 27725

Intr 27820 27972

Intr 28179 28262

Intr 28344 28485

Term 28581 28760

>2564046 735683

len = 697 nex =

Term 34417 34208

Intr 34609 34504

Init 34904 34742

>2564047 /12802

len = 1648 nex =






Init
Intr 
Term 



16402 16518 
17081 17129 
17663 17714 



>2564047 
len = 
Sngl 

>2564047 
len = 



Init 
Intr 
Term 



37480 37886 
37970 38637 
39199 39258 



>2564047 
len = 
Sngl 

>2564047 
len = 



51389 50111 
/13737 
1302 nex = 



Term 
Intr 
Intr 
Intr 
Init 

>2564047 



57880 
58070 
58297 
58633 
58969 



57668 
58011 
58197 
58398 
58725 



1309 nex =



Term 
Intr 
Intr 
Intr 
Init 



57880 57662 

58070 58011 

58297 58197 

58633 58398 

58970 58725 



/114864 



2470 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



59318 
59652 
59821 
60508 
60854 
60996 
61178 
61298 
61566 



59464 
59723 
59895 
60588 
60923 
61087 
61219 
61378 
61785 



>2564047



/105566 






len =
Sngl

>2564047

len =

Term
Init

>2564047

len =

Sngl

>2564048

len =

Init
Term

>2564048
len =
Sngl
>2564048
len =

Term
Intr
Intr
Init

>2564048

len =

Init
Intr
Intr
Intr
Intr
Intr
Term

>2564048

len =

Init
Intr
Intr
Intr



979 nex =

62070 63048 

/12455 

1933 nex = 

67046 66415 
68347 67916 

/40711 

850 nex = 

78369 77529 

/105906 

1212 nex = 

2380 2769 
2946 3591 

/115613 

586 nex = 

31514 30929 

/1200 

1778 nex = 

38609 37885 

38864 38681 

39244 38988 

39662 39331 

739462 

2351 nex = 

41518 41825 

42059 42268 

42387 42557 

42766 42889 

43155 43216 

43305 43386 

43481 43868 

/10292 

1951 nex = 

41667 41825 

42059 42268 

42387 42557 

42766 42889 






Intr 43155 43216 
Intr 43305 43386 
Term 43481 43617 



1980 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



61938 
62306 
62586 
62859 
63011 
63126 
63245 



62027 
62497 
62757 
62932 
63037 
63149 
63657 



1690 nex =



Init 
Intr 
Intr 
Intr 
Term 



65258 65519 

65699 65751 

65845 65980 

66115 66290 

66365 66942 



>2564049 
len = 



737294 



1427 nex =



Term 
Init 



485 294 
1720 625 



>2564049 



/104793 
1873 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



17973 
18663 
18882 
19112 
19304 
19521 
19790 



18128 
18789 
19035 
19208 
19392 
19589 
19845 



len =



2068 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



18007 18128 

18663 18789 

18882 19035 

19112 19208 

19304 19392 

19521 19589 

19790 20074 



/21604 



len =



557 nex = 



1 






Sngl 

>2564049 

len = 

Init 
Intr 
Term 

>2564049 

len = 



28618 28062 

/16144 

1365 nex = 

28919 29348 
29603 29695 
30029 30283 



Init 
Term 

>2564049 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2564050 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2564050 
len = 
Sngl 

>2564050 
len = 



35677 36089 
36890 37494 



1704 nex = 

5026 4812 

5207 5118 

5466 5299 

5691 5572 

5932 5787 

6515 6354 



/6203 



2839 nex 



12017 
12485 
12820 
13048 
13144 
13467 
13634 
13832 
14029 
14202 
14407 
14606 
14766 



12391 
12567 
12974 
13082 
13293 
13562 
13750 
13951 
14121 
14324 
14523 
14668 
14842 



/123496 
674 nex = 
17696 18369 
/16313 
1594 nex = 



Init 2671 2918 
Intr 3227 3325 
Intr 3410 3518






Intr 
Term 

>2564050 

len = 

Term 
Init 

>2564050 

len = 

Term 
Init 

>2564050 
len = 

>2564051 

len = 

Term 
Init 

>2564051 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>2564051 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2564051 

len = 



3687 3758 
3993 4264 

/14738 

1040 nex =

28103 27853 
28892 28606 

/13951 

1063 nex =

28103 27848 
28910 28606 

/38057 
2006 nex = 

/7688 

1722 nex = 

13311 12928 
13996 13887 



2570 



nex 



18254 18493 

18575 18754 

19785 19904 

19917 20078 

20178 20459 

20546 20823 



/30648 



2334 



nex =



33401 33589 

33676 33848 

34149 34268 

34373 34429 

34595 34675 

34763 34797 

34933 35006 

35103 35262 

35380 35734 

/30994 



Init 45513 45608 
Intr 45036 46115 
Intr 46206 46280








Intr 


46370 


46473 


+ 




Intr 


46561 


46717 






Intr 


46810 


46897 






Intr 


46997 


47069 


+ 






47147 


47224 


+ 




>2564051 


/29619 






len = 


942 


nex = 


3 














Init 


46810 


46897 


+ 




Intr 


46997 


47069 








47147 


47224 






>2564051 


729829 






len 


1317 


nex = 


3 




Term 


48114 


47710 






Intr 


48493 


48207 






Init 


49026 


48809 


- 




>2564051 


/6519 






len =


1128 


nex = 


2 




Init


72721 


72978 






Term 


73194 


73848 


+ 




>2564051 


/142033 






len =


651 


nex = 






Init 


72788 


72978 






Term 


73194 


73438 






>2564051 


/14159 






len = 


1394 


nex = 


5 














Term 


74311 


74056 


- 






74603 


74398 






Intr 


74863 


74713 






Intr


75172 


74950 






Init 


75449 


75412 






>2564051


/40866 






len = 


1519 


nex = 


5 














Term 


74311 


74064 






Intr 


74603 


74398 






Intr 


74863 


74713 






Intr 


75172 


74950 






Init 


75582 


75412 






>2564051 


/17770 






len = 


1500 


nex = 


5 






Term 
Intr 
Intr 
Intr 
Init 



74311 74086 

74603 74398 

74863 74713 

75172 74950 

75445 75412 



Term 
Intr 
Init 

>2570223 

len = 



82879 82476 
83240 82973 
83585 83325 



2253 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>2570223 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2570223 
len = 
Sngl 

>2583106 
len = 



17162 
17799 
18430 
18688 
18887 
19185 



2869 

26477 
26840 
27159 
27498 
27878 
28077 
28258 
28478 
28847 



17477 
17892 
18609 
18807 
19020 
19414 



nex = 

25979 
26580 
26941 
27271 
27776 
27965 
28197 
28346 
28757 



/23106 
629 nex = 
74691 75319 
/29207 
2272 nex = 



Init 
Intr 
Intr 
Intr 
Term 



108141 108430 

108875 109071 

109540 109629 

109744 109815 

110152 110412 



>2583106 
len = 
Sngl 



736389 
643 nex = 
121587 120945 






>2583106 
len = 
Sngl 

>2583106 

len = 

Init 
Term 

>2583106 
len = 
Sngl 

>2583106 

len = 

Term 
Intr 
Init 

>2583106 

len = 

Sngl 

>2583106 

len = 

Sngl 

>2583106 

len = 

Term 
Intr 
Intr 
Init 

>2583106 

len = 



/17187 
677 nex = 
121618 120942 

/23203 

704 nex = 

13956 14106 
14207 14659 

72322 

531 nex = 

15480 14950 

/26817 

1955 nex = 

15998 14950 
16202 16119 
16904 16559 

/7709 
1471 nex = 
3827 5297 

733864 
1700 nex = 
64128 64474 

727799 

1690 nex = 

73308 72358 

73553 73400 

73796 73648 

74047 73886 

715659 

2848 nex =



Term 


73308 


72358 




Intr 


73553 


73400 




Intr 


73796 


73648 




Intr 


74168 


73886 




Intr 


74356 


74251 




Intr 


74536 


74446 




Init 


75205 


74719 
len


2568 


nex = 
















Term 


73308 


72638 


- 




Intr 


73553 


73400 






Intr 


73796 


73648 






Intr 


74168 


73886 






Intr 


74356 


74251 






Intr 


74536 


74446 






Init 


75205 


74719 


-




>2583106 


/21765 
















len = 


1874 




0 




>2583106 


/1969 






len = 


5689 


nex = 


1 




Sngl 


88355 


88684 






>2583106 


/37127 
















len =


310 








>2583106


/37621 






len 


3101 


nex = 


15 




Term 


89198 


88862 






Intr 


89371 


89306 








89531 


89462 






Intr 


89689 


89616 






Intr 


89891 


89793 


-




Intr 


90037 


89976 


- 




Intr 


90178 


90137 






Intr 


90316 


90265 


- 




Intr 


90541 


90442 






Intr 


90682 


90638 


- 




Intr 


90843 


90796 






Intr 


91179 


91104 








91456 


91286 






Intr 


91590 


91540 






Init 


91962 


91806 


-




>2584827 


/273 






len =


2260 


nex = 






Term 


101872 


101586 






Intr 


102093 


102017 






Intr 


102388 


102242 






Intr 


102650 


102480 






Init 


103150 


102928 






>2584827 


/5480 






len = 


1994 


nex = 


3 






Term 111925 111310 

Intr 112167 112049 

Init 113303 112263 

>2584827 /5171 

len = 319 nex = 

Sngl 115114 114796 

>2584827 /17426 

len = 597 nex = 

Sngl 115422 114826 

>2584827 /11593 

len = 562 nex = 

Sngl 115422 114861 

>2584827 /25571 

len = 610 nex = 

Sngl 115430 114821 

>2584827 /34348 

len = 1756 nex = 

Term 117147 116843 

Intr 117385 117233 

Intr 117590 117483 

Intr 117734 117687 

Intr 118025 117813 

Intr 118181 118117 

Intr 118386 118262 

Init 118595 118482 

>2584827 /39107 

len = 2383 nex = 

Term 117147 116840 

Intr 117385 117233 

Intr 117590 117483 

Intr 117734 117687 

Intr 118025 117813 

Intr 118181 118117 

Init 118386 118262 

>2584827 /5712 

len = 790 nex = 

Sngl 23900 24682 






>2584827 
len = 
Sngl 
>2584827 
len = 
Sngl 
>2584827 
len = 
Sngl 
>2584827 
len = 
Sngl 
>2584827 
len = 



727675 
745 nex = 
23978 24722 
/116395 
286 nex = 
29868 29583 
74503 
471 nex = 
30113 29549 
722292 
592 nex = 
30233 29642 
725064 
683 nex = 



Term 
Init 



30138 29660 
30342 30322 



>2584827 
len = 



725142 
816 nex =



Term 
Init 



30138 29596 
30411 30322 



>2584827 
len = 



71994 
835 nex =



Term 
Init 

>2584827 

len = 

Term 
Intr 
Init 



30138 29584 

30418 30322 

74479 

655 nex = 

84398 84092 

84584 84486 

84746 84674 



>2584827 
len = 



731676 
649 nex =



Term 
Init 



85483 85211 
85859 85582 






Term
Intr
Intr
Intr
Intr
Intr
Init

>2584827

len =

Init
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Term

>2584827

Init
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Term

>2618599

len =
Sngl

>2618599
len =
Sngl

>2618599
len =
Term



/32472 

1762 nex = 

84398 84113 

84584 84486 

84807 84674 

84997 84910 

85301 85255 

85483 85417 

85870 85582 

78972 

4044 nex = 



95183 
95429 
95608 
95804 
96059 
96231 
96387 
96601 
96783 
97037 
97247 
97422 



1881 

95871 
96059 
96231 
96387 
96601 
96783 
97037 
97247 
97422 



95243 
95523 
95720 
95972 
96098 
96295 
96500 
96665 
96939 
97156 
97335 
97750 



95972 
96098 
96295 
96500 
96665 
96939 
97156 
97335 
97751 



723293 
776 nex =
11508 11343 
76500 
997 nex =
13043 14039 
740212 
1750 nex = 
12611 12066 
Intr
Init 



13271 
13808 



13086 
13354 



>2618599 

len = 

Term 
Intr 
Intr 
Init 



/39514 

1757 nex = 

23461 23198 

24300 24198 

24625 24565 

24954 24707 



>2618599 
len = 



/8490 
1570 nex 



Term 
Intr 
Intr 
Intr 
Init 

>2618599 

len = 



27344 
27528 
27900 
28171 
28711 



27150 
27425 
27615 
27989 
28527 



/96 
1631 nex 



Term 
Intr 
Intr 
Intr 
Init 

>2618599 
len = 
Sngl 

>2618599 
len = 



27344 
27528 
27900 
28171 
28715 



27085 
27425 
27615 
27989 
28527 



/96124 
490 nex = 
61246 60764 
/13096 
1229 nex = 



Init 
Intr 
Term 



71559 71780 
72273 72395 
72494 72787 



>2618599 
len = 



/93205 
977 nex 



Init 
Intr 
Term 



71649 71780 
72273 72395 
72494 72625 



>2618599 
len = 



734629 
1114 nex 



Init 71649 71780 
Intr 72273 72395
Term

>2618599 

len = 

Init 
Intr 
Term 

>2618599 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2618599 

len = 



72494 72762 

/29341 

1153 nex = 

71650 71780 
72273 72395 
72494 72802 

/40708 

1885 nex =



73223 
73417 
74198 
74521 
74712 



72970 
73329 
74135 
74458 
74609 



/37431 
2030 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2618600 
len = 
Sngl 

>2618600 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2618600 

len = 

Term 
Init 



77559 
77742 
78021 
78386 
78575 
78745 
79060 
79293 
79506 



77489 
77644 
77830 
78249 
78489 
78659 
78899 
79157 
79391 



733948 

508 nex = 

11449 11709 

/5511 

2590 nex =

24538 24373 

24781 24651 

24969 24879 

25370 25228 

25905 25465 

26961 26323 

728326 

1413 nex = 

34043 33702 
35114 34932 



>2618600 




725219 
Init
Intr
Term
>2618600
len =

Sngl

>2618600

len =

Init
Intr
Term

>2618600
len =
>2618600
len =
Init
Intr
Intr
Intr
Intr
Term

>2618600
len =

Term
Intr
Init

Term
Intr
Init
>2618601
len =

Term
Intr
Init



36892 37172 
37302 37699 
37781 38425 

/16323 

564 nex = 

38835 39398 

/19433 

3202 nex = 

54780 54996 
55178 55275 
55405 57379 

/17022 
1656 nex = 

/34218 

2898 nex = 

70436 70933 

71361 71426 

71892 72011 

72098 72382 

72467 72739 

72924 73333 

/40280 

1001 nex = 

80824 80546 
81167 81118 
81369 81257 

798459 

1457 nex = 

19035 18675 
19433 19407 
20131 19678 

738593 

1416 nex = 

19035 18762 
19433 19407 
20177 19678 



>2618601



73459 
1481 nex



Term 
Intr 
Init 

>2618601 
len = 
Sngl 

>2618601 
len = 



24639 24049 
24900 24741 
25529 25043 

/41015 
90 nex = 
2878 2789 

/2577 
1537 nex = 



Term 
Intr 
Init 



28396 27900 
28801 28642 
29436 28936 



>2618601 
len = 



723349 
1680 nex =



Term 
Intr 
Init 



33360 32729 
33823 33661 
34408 33933 



>2618601 



/10032 
1835 nex = 



Term 
Intr 
Intr 
Intr 
Init 



37993 37806 

38138 38078 

38275 38222 

38435 38388 

38573 38511 



>2618601 
len = 



728362 
876 nex 



Term 
Init 



41130 40811 
41686 41620 



>2618601 
len = 



/36501 
970 nex =



Term 
Init 



>2618601 
len = 



41130 40731 
41698 41620 



/31107 
1531 nex 



Init 44336 44550 
Intr 44852 44969 
Intr 45277 45424



0 
0 
0 
Term

>2618602

len =

Init
Intr
Term

>2618602

len =

Term
Intr
Intr
Intr
Init

>2618602
len =
Sngl
>2618602
len =

Sngl
>2618602
len =

Sngl
>2618602

len =
Sngl

>2618602
len =
Term
Intr
Intr
Intr
Init

>2618602
len =



45511 45866 

/16835 

1213 nex = 

1864 2123 
2566 2722 
2846 3076 

/30658 

1570 nex = 

22604 22192 

22912 22693 

23242 23001 

23527 23315 

23761 23601 

737357 

1215 nex = 

28713 29927 

725423 

677 nex =

46717 47393 

799873 

900 nex = 

54255 53356 

77672 

267 nex = 

59695 59434 

710243 

1424 nex = 

59695 59432 

59897 59845 

60072 60011 

60287 60254 

60855 60687 

71888 

1916 nex = 



Init 66033 66354 
Intr 66640 66827
Intr
Term 

>2618602 

len = 

Term 
Intr 
Intr 
Intr 
Init 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2618602 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2618603 

len = 

Sngl 

>2618603 



66906 67302 

67408 67948 

/3109 

4196 nex =

76502 75640 

76837 76585 

77507 76944 

78819 78722 

79835 79282 

/43035 

2150 nex = 

6638 6237 

6992 6725 

7269 7075 

7586 7446 

7831 7765 

8009 7930 

8166 8088 

8396 8321 

736948 



2038 



nex 



6638 6378 

6992 6725 

7269 7075 

7586 7446 

7831 7765 

8009 7930 

8166 8088 

8415 8321 

722582 

1527 nex = 



7586 
7831 
8009 
8166 
8415 



7446 
7765 
7930 
8088 
8321 



7143299 
678 nex =
13660 12983 
717250 
2810 nex = 






Term 



22323 21890 



0 
Intr


22694 


22411 




0 


Intr 


22944 


22785 




0 


Intr 


23379 


23050 




0 


Intr 


23590 


23464 




0 


Intr 


23755 


23576 




0 


Init 


24699 


24401 




0 



>2618603 
len =



/40850 
2893 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



57315 
57504 
57696 
57938 
58165 
58344 
58476 
58720 
58890 
59036 
59195 
59447 
59897 



57005 
57416 
57585 
57801 
58019 
58270 
58435 
58601 
58816 
58974 
59126 
59288 
59584 



>2618604 
len = 



2138 



nex 



Term 
Intr 

Intr 
Intr 
Intr 
Intr 
Init 



48168 
48328 
48523 
48712 
49301 
49476 
50030 



47893 
48286 
48450 
48639 
49226 
49399 
49757 



>2618604 
len = 
Sngl 

>2618605 
len = 



2890 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



16301 
16453 
16645 
16810 
17032 
17226 
17695 
17967 
18278 
18991 



16104 
16385 
16540 
16755 
16973 
17142 
17601 
17845 
18215 
18733 



>2618605 




/15525 






1379 nex 



Init 

Intr 
Term 

>2618605 
len = 

>2618605 

len = 

Init 
Term 

>2618605 

len = 

Init 
Intr 
Term 

>2618605 
len = 

>2618605 

len = 

Term 
Init 

>2618605

len = 
Sngl 
>2618605 

len = 
>2618605 

len = 
>2618605 

len = 
Sngl 
>2618605 

len = 



19682 19907 
20580 20660 
20864 21060 

/18394 

924 nex = 

737433 

1210 nex = 

19695 19907 
20580 20660 

/98771 

1392 nex = 

19695 19907 
20580 20660 
20864 21086 

74869 
1372 nex = 

76170 

1030 nex = 

21605 21167 
22188 21736 

7114037 

370 nex = 

29570 29939 

739018 

1334 nex = 

712152 

1346 nex = 

711651 

1166 nex = 

47145 48310 

78440 

1831 nex = 
Term
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



55825 
56233 
56458 
56617 
56796 
57030 
57418 



55588 
56081 
56347 
56546 
56703 
56879 
57264 



Term 
Intr 
Init 

>2618677 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2618677 
len = 
Sngl 

>2618677 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2618677 
len = 
Sngl 

>2618677 
len = 



1296 nex = 

12884 12633 

13526 13383 

13928 13599 



1708 nex =

28265 27997 

28700 28345 

29322 28785 

29506 29408 

29704 29600 

/111449 



/30006 
2334 nex =



49398 
49719 
49924 
50096 
50217 
50414 
50565 
50755 
50894 
51110 
51254 
51417 



49629 
49823 
49986 
50129 
50339 
50476 
50658 
50800 
50961 
51158 
51319 
51731 



/30716 
804 nex = 
5723 6526 

/29451 
850 nex = 



Sngl



61457 62306 



0 
>2618677
len = 
Sngl 
>2618677 
len = 
Sngl 
>2618677 



/141763 
915 nex = 
6460 5546 
/18592 
342 nex =
76241 76582 
/452 



len 



1630 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Init 



88176 87859 

88357 88285 

88547 88476 

88713 88642 

88988 88858 

89480 89128 



>2618677 
len = 
Sngl 

>2618683 
len = 



75532 
915 nex =



1974 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2618683 

len = 



24726 
24881 
25061 
25435 
25623 
25828 
26375 



24402 
24816 
24963 
25239 
25576 
25711 
26175 



2051 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



24726 
24881 
25061 
25435 
25623 
25828 
26447 



24397 
24816 
24963 
25239 
25576 
25711 
26175 



>2618683 
len = 



Init 27061 27168 
Intr 27764 28346
Intr
Intr 
Intr 
Term 

>2618683 

len = 



28448 28669 

28748 28995 

29088 29227 

29422 29722 



72657 



2247 



nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2618683 



56579 56243 

56748 56672 

56919 56831 

57186 57056 

57409 57280 

57542 57518 

58489 57939 

/2085 



Init 
Term 

>2618683 
len = 
Sngl 

>2623294 
len = 



64780 64842 
64924 65340 

735456 
1030 nex = 
68721 69748 

76610 
1969 nex = 



Init 
Term 



>2623294 



13412 14121 
14135 14416 



711739 
521 nex 



Init 
Term 



27204 27453 
27544 27724 



>2623294 
len = 



71297 
1365 nex =



Init 
Intr 
Term 

>2623294 

len = 



52988 53175 
53533 53759 
53848 54135 

72337 

4786 nex = 



Init 
Intr 
Intr 
Term 



65365 65544 

65952 66570 

66666 66939 

67114 67245 
>2623294

len = 

Init 
Term 

>2642152 
len = 

>2642152 
len = 
Sngl 

>2642152 

len = 

Term 
Init 

>2642152 
len = 
Sngl 

>2642152 

len = 

Init 
Intr 
Intr 
Term 

>2642152 

len = 

Init 
Intr 
Intr 
Term 

>2642152 

Sngl 
>2642152 
len = 
Init 



/30905 

984 nex =

71769 71892 
72257 72752 

/105724 

1015 nex = 

/5605 

1069 nex = 

12014 12116 

/123288 

600 nex =

13599 13304 
13903 13724 

/18133 

271 nex = 

21652 21382 

78256 

3019 nex = 

25623 26134 

26620 26968 

27055 27331 

28305 28641 

737775 

2269 nex = 

31242 31353 

32126 32188 

32277 32516 

32603 33510 

7111626 

250 nex = 

33270 33510 

727496 

2548 nex = 

33756 33970 
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Term

>2642152



34189 
34335 
34530 
34670 
35009 
35365 
35796 
35999 



34217 
34408 
34631 
34777 
35100 
35525 
35880 
36303 



Init 
Intr 
Term 

>2642152 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



39520 39810 
39958 40298 
40402 40687 

/17221 

2364 nex = 



41130 
41304 
41463 
41617 
41866 
42123 
42323 
42505 
42680 
42849 
43018 
43229 
43431 



41068 
41234 
41397 
41549 
41716 
42048 
42220 
42409 
42594 
42763 
42929 
43152 
43342 






Term 
Intr 
Intr 
Intr 
Intr 
Init 






>2642152 

len = 

Sngl 

>2642427 

len =

Term 
Init 



2590 nex = 

52943 52448 

53435 53162 

53934 53766 

54202 54122 

54359 54302 

55032 54833 

/114221 

321 nex = 

69754 70074 

728853 

1030 nex = 

106334 106067 

107087 106553 



>2642427



/30636 
len =

Term
Intr
Init

>2642427

len =

Term
Intr
Init

>2642427

len =

Term
Intr
Init

>2642427

len =
Sngl

>2642427
Sngl

>2642427

len =

Term
Init

>2642427

len =

Term
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Init
>2645198
len =



1167 nex = 

25457 25117 
25685 25560 
26283 25787 

738686 

1419 nex = 

25457 25104 
25685 25560 
26522 25787 

/25204 

1692 nex = 

32565 32226 
32850 32670 
33917 33379 

/40307 

1118 nex = 

45533 46650 

/40943 

631 nex = 

48222 47592 

/22551 

670 nex = 

5656 5170 
5836 5750 

722936 

3101 nex = 

5656 5284 

5932 5750 

6183 6016 

6419 6285 

6590 6501 

6761 6684 

7515 6859 

7718 7600 

8384 7931 

728033 

817 nex = 



1003 

3 

0 
0 
0 



3 

0 
0 
0 



3 

0 
0 
0 



4- 0 



0 



2 

0 
0 



9 

0 
0 
0 
0 
0 
0 
0 
0 
0 
Sngl
>2645198 
len = 



15766 16582 
/144393 
651 nex =



Term 
Init 



43394 43065 
43715 43483 



>2645198 
len = 



/34660 
676 nex =



Term 
Init 



51571 51243 
51918 51697 



>2645198 
len = 



/21053 
691 nex 



Term 
Intr 
Init 

>2645198 

len = 



54371 54189 
54602 54463 
54879 54688 

736567 

1873 nex = 



Term 
Init 



6681 6341 
8213 7815 



>2651294
len = 



1979 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



119 
563 
984 
1166 
1300 
1482 
1671 
1814 



467 
788 
1065 
1223 
1388 
1577 
1693 
2097 



>2651294 



/99409 
1607 nex 



Init 
Intr 
Term 

>2651294 

len = 

Init 
Intr 
Intr 
Term 



12703 12929 

13029 13341 

13833 14309 

/19516 

1380 nex =

17001 17093 

17192 17276 

17399 17532 

17965 18220 
>2651294

Sngl 

>2651294 

len = 

Sngl 

>2651294 

len =

Term 
Intr 
Intr 
Init 

>2651294 

len = 

Sngl 

>2651294 

len = 

Sngl 

>2651294 

len = 

Init 
Intr 
Term 

>2651294 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



725574 

610 nex = 

29291 28700 

/15024 

695 nex = 

35758 35064 

/28121 

1330 nex = 

43723 43497 

44224 44091 

44474 44390 

44667 44575 

/120147 

792 nex = 

48787 48117 

/12956 

1078 nex =

55272 56349 



59647 59727 
59886 60144 
60429 60879 



3115 nex 



80840 
81295 
81548 
81692 
81829 
82014 
82222 
82425 
82617 
82768 
82979 
83207 
83465 
83626 



81063 
81464 
81599 
81717 
81912 
82124 
82304 
82515 
82568 
82858 
83101 
83329 
83531 
83705 
83788


83954 




0 




>2656024 


72925 








len = 


2110 




6 






Init


21471 


21726 




0 




Intr 


21825 


21900 


+ 


0 




Intr 


21980 


22075 




0 




Intr 


22220 


22557 


+ 


0 






23046 


23175 




0 




Term 


23299 


23578 




0 




>2656024 


/10982 




















len 


2112 




8 






Term 


36311 


35994 


- 


0 




Intr 


36483 


36427 




0 




Intr 


36652 


36568 




0 




Intr 


36876 


36751 


- 


0 






37003 


36960 








Intr 


37209 


37090 








Intr 


37445 


37291 


- 


0 




Init 


38105 


38035 




0 




>2656024 


/115913 








len = 


454 


nex = 


1 


















Sngl


69260 


69713 








>2656025 


/42551 








len = 


1597 


nex = 


3 






Init 


21954 


22111 


+ 


0 




Intr


22318 


22630 




0 




Term


22710 


23356 




















>2656025 


/17690 








len = 


2548 


nex = 


6 






Init 


26822 


27289 


+ 


0 




Intr 


27502 


27719 


+ 


0 






27833 


27945 








Intr 


28683 


28746 




0 






28894 


28947 




0 




Term


29171 


29369 








>2656025 


/31580 










2657 


nex = 


3 


















Term 


34185 


33154 




0 




Intr 


34744 


34283 




0 




Init 


35810 


35427 




0 



>2656025



/21044 
nex



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2656026
len = 
Sngl 

>2656026 
len = 

Init 

Term 

>2656026 

len = 

Init 
Term 

>2656026 

len = 

Init 
Intr 
Intr 
Term 

>2656026 
len = 
Sngl 

>2656026 
len = 



36770 
36904 
37063 
37298 
37535 
37706 
37887 
38077 
38499 
38817 
38980 
39610 
40211 
40458 
40839 
41116 



641 



36474 
36853 
37002 
37211 
37467 
37649 
37786 
37976 
38169 
38701 
38912 
39366 
39972 
40363 
40723 
40940 



nex 



21786 22426 

732253 

1708 nex = 

22982 24225 
24310 24689 

739882 

1499 nex = 

22997 24225 
24310 24471 

/205741 

1213 nex = 

33785 34004 

34198 34304 

34400 34430 

34537 34990 

727452 

1064 nex = 

39160 40223 

734082 

773 nex = 



Sngl 47232 46460



0 
>2656026

len = 

Init 
Intr 
Term 

>2656026 
len = 
Sngl 

>2656026 
len = 

>2656027 

len = 

Init 
Intr 
Intr 
Term 

>2656027 

len = 

Init 
Intr 
Intr 
Term 

>2656027 

len = 

Init 
Intr 
Intr 
Term 

>2656027 

len = 

Init 
Intr 
Intr 
Term 

>2656028 

len = 

Init 



742826 

1690 nex = 

47575 48443 
48536 48674 
48854 49260 

79664 

810 nex = 

50539 49730 

716760 

3571 nex = 

737474 

5143 nex = 

22714 22946 

23554 23642 

23726 23824 

23923 24012 

7117516 

1426 nex = 

22758 22946 

23554 23642 

23726 23824 

23923 24183 

796557 

1396 nex = 

22776 22946 

23554 23642 

23726 23824 

23923 24171 

734532 

2155 nex = 

24706 25078 

25689 26087 

26183 26517 

26605 26860 

739718 

910 nex = 

21537 21782 
>2656028
len =



22349 22446 
/11128 



1591 



nex 



Init 
Intr 
Intr 
Intr 
Term 



21555 21782 

22349 22446 

22555 22722 

22802 22844 

22931 23145 



>2656028 
len =



/38185 



2448 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



23351 
23563 
23838 
24026 
24195 
24373 
24640 
25228 



23070 
23476 
23740 
23922 
24126 
24277 
24499 
25069 



>2656028 
len = 



2579 



nex =



Term
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Init

>2656028



23351 
23563 
23838 
24026 
24195 
24373 
24640 
25228 
25632 



23054 
23476 
23740 
23922 
24126 
24277 
24499 
25069 
25564 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 
>2656028



2454 

39978 
40161 
40298 
40511 
40705 
40916 
41213 
41361 
41523 
42077 



nex = 

39624 
40113 
40244 
40391 
40604 
40809 
40996 
41289 
41447 
41767 



len = 
Sngl 



690 nex = 
55512 55324 



>2656028



/36094 
2124 nex



Init 
Intr 
Term 

>2656028 

len = 

Term 
Intr 
Intr 
Init 

>2656028 

len = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2656028 

len = 

Sngl 

>2656028 

len = 

Sngl 

>2656028 

len = 

Sngl 

>2656028 

len = 

Init 
Intr 
Intr 
Intr 
Term 



56165 56791 
56822 56893 
57886 58288 

/3303 

1667 nex =

58518 58274 

58878 58601 

59124 58950 

59940 59506 

/18435 

2060 nex = 

60472 60847 

60939 61071 

61351 61440 

61533 61657 

61748 61790 

61872 61989 

62089 62138 

62218 62531 

/11539 

864 nex =

65472 66335 

/34900 

430 nex = 

75838 76262 

/16977 

430 nex = 

75840 76262 

/155377 

850 nex = 

9258 9512 

9598 9659 

9739 9790 

9876 9990 

10076 10107 



/2641 



len = 



1789 nex = 
Init
Intr 
Intr 
Intr 
Term 

>2656029 
len = 
Sngl 

>2656029

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2656029 
len = 
Sngl 

>2656029 

len = 

Term 
Intr 
Init 

>2656029 
len = 
Sngl 

>2656029 

len = 

Init 
Intr 
Intr 
Term 

>2656029 

len = 

Sngl 



15938 16144 

16242 16362 

16458 16549 

16632 16728 

16810 17371 

/1843 

670 nex = 

18752 18085 

/578 

1630 nex =

36717 36759 

36898 36971 

37133 37178 

37268 37338 

37454 37498 

37595 37663 

37791 38345 

722827 

884 nex =

374 1257 

75589 

1459 nex = 

43267 42861 
43429 43348 
43780 43535 

730314 

915 nex = 

48635 49549 

738273 

3043 nex = 



6585 
7927 
8561 
9213 



6905 
8060 
8708 
9319 



710991 
550 nex = 
66286 66832 
>2656030

len = 

Init 
Intr 
Term 

>2656030 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2656031 

len = 

Init 
Term 

>2656032 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2656032 

len = 

Sngl 

>2660661 

len = 

Sngl 

>2660661 

len = 

Term 
Init 



78500 78616 
78721 78891 
79064 79432 



1630 nex = 

79640 79440 

79934 79737 

80170 80021 

80795 80709 

81069 80880 

/6333 

1318 nex = 

18702 19086 

19654 20019 



3269 

17542 
17886 
18311 
18505 
18958 
19218 
19444 
19841 
20110 
20311 



nex = 

17641 
18084 
18413 
18575 
19119 
19355 
19737 
19993 
20205 
20555 



/1048

1073 nex = 

34012 32940 

/31042 

1330 nex = 

12325 13650 

/1383 

1030 nex =

64341 63993 
65022 64911 



>2660661



/39167 
Init
Intr 
Term 



77572 77856 
77938 78144 
78228 78610 



>2660661 



2474 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2673901 
len = 
Sngl 

>2673901 

len = 

Init 
Intr 
Term 



78971 
79182 
79367 
79521 
79694 
79951 
80178 
81114 



78641 
79069 
79265 
79457 
79606 
79775 
80029 
80933 



/12267 

1283 nex = 

21553 21782 

/33002 

2397 nex = 

25887 26600 
27248 27393 
27470 28283 



>2673901 
len = 



Init 
Term 



29263 29613 
30084 30468 



>2673901 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



3159 

32389 
32585 
32774 
33085 
33504 
33696 
34178 
34443 
35212 



nex = 

32054 
32474 
32682 
32882 
33364 
33592 
33798 
34303 
34776 



>2673901 
len = 



Sngl



35473 35707 



0 
>2689438
len = 
Sngl 

>2689438 

len = 

Term 
Intr 
Intr 
Init 

>2689438 

len = 

Init 
Intr 
Intr 
Term 

>2689438 
len = 
Sngl 

>2689438 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2696018 

len = 

Sngl 

>2696018 

len = 

Sngl 

>2696018 

len = 



/151406 

472 nex =

16845 17316 

/3000 

2973 nex =

25402 24983 

26284 26031 

27230 26975 

27948 27691 

/31290 

1945 nex = 

48660 48849 

48947 49173 

49650 49983 

50177 50604 

/25158 

448 nex = 

50196 50643 

/35916 

2140 



72752 
72927 
73134 
73359 
73775 
73908 
74175 
74495 



nex = 

72356 
72844 
73078 
73231 
73689 
73861 
74019 
74380 



/143114 
250 nex = 
24191 24437 
75698 
1017 nex = 
32527 32321 
792795 
654 nex = 
Sngl


49709 


49056 


-


0 




>2696018 


/6629 








len = 


2745 


nex = 


8 






Init 


56009 


56062 


+ 


0 




Intr 


56175 


56325 


+ 


0 




Intr 


56866 


56932 


+ 


0 




Intr 


57026 


57174 


+ 


0 




Intr 


57438 


57571 


+ 


0 




Intr 


57657 


57755 


+ 


0 




Intr 


57849 


57924 




0 






58005 


58355 


+ 


0 
















>2696018 


/11275 








len = 


2084 


nex = 


9 






Init 


56016 


56062 


+ 


0 




Intr 


56175 


56325 


+ 


0 




Intr 


56866 


56932 


+ 


0 




Intr 


57026 


57174 


+ 


0 




Intr 


57370 


57405 


+ 


0 




Intr 


57438 


57571 


+ 


0 




Intr 


57657 


57755 


+ 


0 






57849 


57924 


+ 


0 




Term 


58005 


58099 


+ 


0 




>2696018 


/4330 








len =


1842 










Term 


67677 


67445 


- 


0 




Intr 


67827 


67768 




0 




Intr 


68176 


68136 


- 


0 




Intr 


68784 


68715 




0 






68912 


68859 




0 




Init 


69286 


69237 


- 


0 
















>2696018 


/37111 








len = 


992 


nex = 


2 








79220 


78599 




0 




Init 


79576 


79309 




0 




>2702261 


/26812 








len =


2400 


nex =


7 






Init 


2306 


2536 


+ 


0 




Intr 


2938 


2981 


+ 


0 




Intr 


3093 


3250 


+ 


0 




Intr 


3640 


3767 


+ 


0 




Intr 


3879 


4001 




0 




Intr 


4101 


4220 


+ 


0 




Term 


4310 


4705 




0 



>2702261



/3972 
len =


2490 


nex = 


8 






Init 


2306 


2536 


+ 


0 




Intr 


2938 


2981 


+ 


0 




Intr 


3093 


3250 


+ 


0 




Intr 


3422 


3557 


+ 


0 




Intr 


3640 


3767 


+ 


0 






3879 


4001 


+ 


0 




Intr 


4101 


4220 


+ 


0 




Term 


4310 


4795 




0 




>2702261 


/38147 








len = 


2387 


nex = 


6 






Init 


23695 


23892 


+ 


0 




Intr 


24225 


24350 


+ 


0 




Intr 


24510 


24614 


+ 


0 






24888 


24956 


+ 


0 




Intr 


25075 


25188 


+ 


0 




Term 


25563 


26081 




0 




>2702261 


/18783 




















len =


3623 










Init 


85480 


85974 


+ 


0 




Intr 


86065 


86230 




0 




Intr 


86316 


86366 


+ 


0 




Intr 


86456 


86638 


+ 


0 






86718 


86811 




0 




Intr 


86890 


87554 


+ 


0 




Term 


87651 


87783 


+ 


0 
















>2708736 


/18625 








len =


1123 


nex = 


2 






Term 


21014 


20394 




0 




Init 


21516 


21326 




0 



>2708736 
len = 
Sngl 
>2708736 

Init 
Term 

>2708736 

len = 

Sngl 



/18284 
733 nex = 
2752 2020 

/33031 

509 nex = 

39554 39695 
39767 40041 

/22711 

766 nex = 

56611 57376 
>2708736

len = 

Sngl 

>2708736 

len = 

Sngl 

>2708736 

len = 

Init 
Intr 
Intr 
Term 

>2708736

len = 

Init 
Intr 
Intr 
Term 

>2739359 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>2739359 

len = 

Init 
Term 

>2739359 

len = 

Sngl 

>2739359 

len = 

Sngl 



/12534 

808 nex = 

57189 56382 

724747 

1231 nex = 

59076 60306 

79676 

895 nex = 

63519 63714 

63796 63961 

64080 64133 

64232 64413 

740806 

2170 nex =

66493 67471 

67805 67971 

68059 68143 

68241 68655 

722595 

2809 nex = 

29529 28886 

30066 29793 

30458 30289 

30730 30590 

30933 30852 

79368 

715 nex = 

55832 55932 
56029 56546 

730772 

673 nex =

55861 55932 

714528 

644 nex = 

56029 56527 
2950



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2739359 

len = 

Sngl 

>2739359 

len = 

Sngl 

>2739359 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2749918 

len = 

Term 
Intr 
Intr 
Init 

>2749918 

len = 

Term 
Intr 
Intr 
Intr 
Init 



3431 
3534 
3857 
4371 
4756 
4924 
5064 
5348 
5522 
5694 
5801 
6098 



3158 
3459 
3781 
4277 
4623 
4846 
4995 
5159 
5428 
5605 
5770 
5900 



/628 
827 nex =
65777 66603 
/33680 
634 nex =
69683 69050 
/1999 
1979 nex = 



79655 
79924 
80307 
80697 
81152 
81468 



79490 
79752 
80005 
80536 
80784 
81236 



727548 

1572 nex = 

118910 118818 

119099 119049 

119342 119186 

119799 119495 

72157 

1164 nex = 

33097 32732 

33226 33170 

33418 33313 

33578 33520 

33895 33656 



741337 
len =



2907 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2749918 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2760164 

len = 

Term 
Intr 
Init 

>2760164 
len = 
Sngl 

>2760164 

len = 

Term 
Intr 
Intr 
Init 

>2760164 

Sngl 
>2760164 
len = 



40133 
40316 
40452 
40631 
40820 
41443 
41920 
42758 



2249 

73192 
73427 
73554 
73679 
73857 
74009 
74203 
75204 



39852 
40251 
40399 
40560 
40725 
41366 
41841 
42004 



nex = 

72956 
73285 
73513 
73638 
73758 
73942 
74119 
74990 



/122489 

748 nex = 

1929 1667 
2065 2000 
2414 2235 

/141953 

404 nex = 

40494 40091 

76393 

1887 nex =

45207 44626 

45909 45643 

46332 46110 

46512 46432 

/12030 

1175 nex = 

62893 63617 

75684 

1901 nex = 



Init 71453 71775 
Intr 71875 71990
Intr
Intr 
Intr 
Term 



72555 72678 

72782 72930 

73031 73093 

73185 73353 

/36046 



Init 
Intr 
Term 

>2760164 

len = 

Init 
Intr 
Term 

>2760165 
len = 
Sngl 

>2760165 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>2760165 
len = 
Sngl 

>2760165 

len = 

Term 
Intr 
Intr 
Init 

>2760165 

len = 

Term 
Intr 
Intr 
Intr 



871 1095 
1189 1297 
1413 1767 



991 1095 
1189 1297 
1413 1748 

/19114 

1646 nex = 

20275 19212 

/4193 

1620 nex = 



30702 
30939 
31132 
31420 
31892 



30854 
31019 
31326 
31797 
32219 



/17486 

386 nex = 

32921 32536 

/17455 

1330 nex = 

32949 32576 

33144 33034 

33293 33222 

33897 33392 

72869 

2553 nex = 

32949 32636 

33144 33034 

33293 33222 

34521 33392 
Intr
Init 

>2760165 

len = 

Term 
Intr 
Init 

>2760165 

len = 

Init 
Intr 
Intr 
Term 

>2760165 

len = 

Term 
Intr 
Intr 
Init 

>2760165 

len = 

Init 
Intr 
Term 

>2760165 

len = 

Init 
Intr 
Term 

>2760166 

len = 

Init 
Intr 
Term 

>2760166 

len = 



34888 34819 
35188 34989 

/1022 

1103 nex = 

37453 37053 
37627 37537 
38155 37697 

735974 

1669 nex = 

42735 42859 

43198 43462 

43554 43748 

43823 44403 

/34861 

2759 nex = 

45311 44615 

45749 45508 

46640 46252 

47373 47034 

/146274 

1378 nex = 

56343 56430 
56568 56695 
56823 57131 

72996 

1590 nex = 

6987 7404 
7510 8043 
8152 8576 

737308 

1640 nex = 

36323 36769 
36888 37165 
37289 37962 

737792 

1517 nex = 



Init 36888 37165 
Term 37289 37987 

>2760166
len = 

Term 

Init 

>2760167 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>2760167 

len = 

Sngl 

>2760167 

len = 

Sngl 

>2760167 

len = 

Term 
Intr 
Init 

>2760167 

len = 

Term 
Intr 
Init 

>2760167 

len = 

Init 
Intr 
Term 



5587 5255 
6016 5725 



2810 

32480 
32689 
32895 
33101 
33238 
33451 
33637 
33811 
33992 
34225 
34757 



nex = 

31948 
32573 
32777 
32993 
33200 
33373 
33563 
33747 
33937 
34133 
34493 



/101988 

1033 nex = 

40367 40745 

/92179 

993 nex = 

40367 40745 

/13697 

1356 nex = 

42553 42287 
42795 42646 
43642 43418 

/30674 

804 nex = 

44295 44028 
44536 44458 
44831 44714 

/7119 

1930 nex = 

47503 47617 
47702 48982 
49233 49428 



>2760167



/9480 
1375



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



Sngl 
>2760167 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



51156 
51501 
51659 
51994 
52170 
52439 



51210 
51557 
51776 
52083 
52349 
52514 



735248 
116 nex = 
52672 52787 
/20790 
1918 



53351 
53877 
54137 
54434 
54690 
54954 



nex = 

53565 
54048 
54333 
54601 
54818 
55268 



3272 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2760167 

len = 

Term 
Intr 
Intr 
Init 

>2760167 

len = 



6342 
6499 
6719 
6858 
6960 
7214 
7382 
7542 
7667 
8046 
8260 
8439 
8647 
8935 



6410 
6570 
6773 
6876 
7027 
7288 
7437 
7586 
7930 
8140 
8350 
8549 
8841 
9304 



737493 

3313 nex = 

66659 66266 

68318 68043 

69011 68592 

69578 69461 

77548 

1930 nex = 

79732 79374 



Reference No. 2750-942P 



Intr 
Intr 
Intr 
Intr 
Init 

>2760167 
len = 
Sngl

>2760167 

len = 

Term 
Intr 
Init 

>2760168 

len = 

Init 
Intr 
Term 

>2760168 

len = 

Term 
Init 

>2760168 

len = 

Sngl 

>2760168 

len = 

Sngl 

>2760168 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



80062 79829 

80336 80169 

80655 80426 

80848 80749 

81298 80951 

/10973 
1259 nex = 
82221 82174 

/10749 

1330 nex =

81820 81596 
82221 82174 
82918 82706 

/1165 

1197 nex = 

13720 13876 
14260 14352 
14686 14916 

742577 

796 nex = 

23611 23112 
23907 23692 

/12729 

216 nex = 

2967 2752 

/19343 

612 nex = 

2989 2378 

/17242 

2173 nex = 



31582 
31798 
32205 
32354 
32868 
33072 
33388 



31714 
31982 
32264 
32635 
32963 
33268 
33754 



>2760168



794968 
len =

Sngl 

>2760168 

len = 

Init 
Intr 
Term 



len = 

Init 
Intr 

Intr
Intr 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Term 

>2760169
len = 
Init 

Term
>2760169 
len = 


Sngl 
>2760169 
len =

Sngl 
>2760169 


len = 
Sngl 

>2760169
len = 



467 nex = 

3825 4291 

/152076 

1210 nex = 

598 955 
1255 1356 
1441 1806 

78993 

2713 nex = 

1 127 

474 621 

698 769 

943 1014 

1227 1298 

1396 1539 

1636 1716 

1844 1886 

1966 2037 

2143 2214 

2426 2713 

/14492 

651 nex = 

20745 20918 
21014 21373 

/5810 

1439 nex = 

24319 23980 

/38421 

1851 nex = 

44204 42354 

/16827 

564 nex = 

48497 49060 

/12970 

1536 nex = 



Term 72431 71753 
Intr 72794 72517
>2760170



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



8439 
8753 
8923 
9409 
9622 
9990 
10341 



nex = 

7872 
8655 
8846 
9327 
9525 
9959 
10081 



>2760170 
len = 



Term 
Init 



/100904 
616 nex 



72185 72007 
72622 72442 



>2760170 
len = 



Term 
Init 



/4536 
706 nex 



72185 71972 
72677 72442 



>2760170 
len = 



Term 
Init 



72185 72004 
72687 72442 



>2760170 
len = 



1994 



Init 
Intr 
Intr 
Intr 
Term 



79105 79274 

79891 79988 

80281 80424 

80520 80749 

80836 81098 



74585 
2746 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



25736 
25904 
26088 
26440 
26630 
26871 
27087 
27378 
27618 
27801 
27896 



25556 
25816 
25978 
26311 
26532 
26767 
26953 
27174 
27490 
27711 
27863 
Init

>2760171 

len = 

Init 
Intr 
Intr 
Term 

>2760171 

len = 

Init 
Intr 
Intr 
Term 

>2760171 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2760172 

len = 

Term 
Intr 
Intr 
Init 

>2760172 

len = 

Sngl 

>2760172 

len = 

Sngl 

>2760172 

len = 



28301 28131 



1349 nex = 

51419 51557 

51763 51863 

51971 52167 

52288 52767 



/31563 



2637 



nex =



7482 7708 

7821 7962 

8116 8177 

8473 8676 

/37100 

1735 nex = 

69229 69386 

69531 69707 

69794 69925 

70009 70107 

70188 70304 

70399 70494 

70648 70760 

70850 70954 

724599 

730 nex = 

26279 26046 

26489 26370 

26593 25564 

26767 26693 

/25884 

670 nex =

51002 51664 

/8981 

357 nex = 

51004 51353 

/21903 

940 nex =



Sngl 57422 58361 




0 
>2760172
len = 



/14033 
1585 nex 



Term 
Intr 
Intr 
Init 



71969 71571 

72172 72083 

72354 72271 

73155 72644 



>2760172 
len = 
Sngl 

>2760173 
len = 



/205721 
693 nex = 
78960 78268 
/41408 
1404 nex = 



Init 
Intr 
Term 



2606 2664 
2991 3372 
3572 4009 



>2760173 
len = 



/100590 
1692 nex =



Init 
Intr 
Intr 
Intr 
Term 



27003 27364 

27452 27570 

27661 27790 

27912 28130 

28473 28694 



>2760173 
len = 



/14898 
1241 nex = 



Term 
Intr 
Init 



43600 43275 
44232 44110 
44515 44319 



>2760173 
len = 



/7791 
1186 nex =



Term 
Intr 
Init 



43600 43330 
44232 44110 
44515 44319 



>2760173 
len = 



/15331 
1000 nex 



Term 
Intr 
Init 



43600 43516 
44232 44110 
44515 44319 



len =



1047 nex = 



2 
Init
Term 

>2760316 

len = 

Init 
Term 

>2760316 

len = 

Init 
Term 

>2760316 

len = 

Term 
Init 

>2760316 

len = 

Term 
Init 

>2760316 
len = 
Sngl 

>2760316 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2760316 

len = 

Init 
Intr 
Intr 
Term 



9992 10398 
10735 11038 

/42320 

1227 nex = 

13777 14194 
14314 15003 

/14043 

1172 nex = 

13832 14194 
14314 15003 

728426 

1303 nex = 

15264 15087 
15991 15736 

71542 

1358 nex =

15264 15084 
15991 15736 

77553 

1150 nex = 

36870 38015 

713346 

2352 nex = 

61238 61435 

61585 61780 

61871 61995 

62136 62200 

62323 62535 

62670 62717 

62833 62940 

63036 63176 

63264 63589 

727632 

1338 nex = 

86833 87026 

87327 87427 

87723 87820 

87960 88170 
>2760829
len = 


Init 
Intr 
Term 

>2760829
len = 
Term 

Init
>2760829 
len = 


Sngl 

>2760829 

len =

Term 
Init 

>2795802
len = 






Term 
Intr 
Init 



>2795802 
len =

Sngl 
>2815404 


len = 
Sngl 

>2815404
len = 
Sngl 


>2815404 
len = 
Sngl



733692 

1822 nex = 

11711 12231 
12887 13110 
13186 13532 

/41755 

1546 nex = 

14354 13832 
15377 14681 

728786 

1722 nex = 

70 1791 

74754 

970 nex = 

77740 77286 
77876 77815 

717353 

1734 nex = 

20024 19317 
20663 20345 
21050 20759 

727620 

670 nex =

32381 33046 

742968 

610 nex = 

14332 14937 

722971 

620 nex = 

14341 14960 

79731 

675 nex = 

35713 35039 
>2815404

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2815404 

len = 

Init 
Intr 
Intr 
Term 

>2815404 

Init 
Term 

>2815404 

len = 

Sngl 

>2815519 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2815519 

len = 

Sngl 

>2815519 

len = 

Sngl 



/18451 
2372 nex 



45075 
45497 
45704 
46227 
46695 
46860 
47066 
47312 



45423 
45617 
46131 
46383 
46777 
46956 
47135 
47446 



/15499 

1870 nex = 

67484 67838 

68518 68746 

68868 68992 

69083 69344 

/18800 

910 nex = 

70956 71114 
71453 71857 

79878 

687 nex = 

77572 76886 

/96891 

1073 nex = 

22493 22461 

22634 22576 

22953 22881 

23103 23039 

23313 23201 

23533 23402 

/25380 

750 nex = 

26997 26248 

78432 

310 nex = 

27005 26703 



>2815519



736424 
len =
Sngl 
>2815519 
len = 
Sngl 
>2815519 
len = 
Sngl 
>2815519 
len = 
Sngl 
>2815519 
len = 
Sngl 
>2815519 

Sngl 

>2815519 

len = 

Sngl 

>2815519 

len = 

Sngl 

>2815519 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



412 nex = 
27008 26597 
/2327 
394 nex = 
27008 26615 
/113981 
819 nex = 
27010 26192 
/7630 
763 nex =
27010 26248 
/31731 
762 nex = 
27010 26249 
/35047 
396 nex =
27010 26615 
/16438 
400 nex = 
27014 26615 
736235 
773 nex = 
27020 26248 
/20963 
2032 nex = 



44481 
44737 
44989 
45281 
45505 
45673 
45891 
46172 
46318 



44658 
44883 
45105 
45424 
45591 
45788 
46072 
46229 
46512 
>2815519
len = 



/4249 
1602 nex 



Init 


47167 


47260 


+ 


0 


Intr 


47353 


47634 


+ 


0 


Intr 


47724 


47945 


+ 


0 


Intr 


48032 


48196 


+ 


0 


Term 


48281 


48768 


+ 


0 



>2815519 
len = 


Init 
Intr 
Intr 
Term 


>2815519 
len = 

Init

Term 

>2815519 

len =

Sngl 
>2815519 


len = 
Sngl 

>2815519
len = 
Sngl 


>2815519 

len = 

Term
Init 

>2815519 

len =

Term 
Init 



/11560 

1777 nex = 

61717 61829 

61954 62030 

62992 63183 

63276 63493 

/113946 

504 nex = 

62991 63183 
63276 63494 

/36710 

414 nex = 

76124 75711 

/14108 

1046 nex = 

76756 75711 

738663 

1041 nex = 

76756 75716 

/11318 

1438 nex = 

76754 75679 
77108 77052 

/38261 

1484 nex = 

76754 75635 
77108 77052 



>2815519



/25772 
Sngl 77108 77052



2110 



Init 
Intr 
Intr
Intr 
Term 



89813 
90317 
90587 
91091 
91508 



89995 
90488 
91007 
91156 
91915 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2950 nex = 



92911 
93238 
93386 
93536 
93725 
94016 
94196 
94389 
94796 
94970 
95139 
95447 



92505 
92998 
93324 
93468 
93642 
93811 
94106 
94285 
94722 
94878 
95064 
95254 



>2827513 

Term 
Intr 
Init 



16992 16242 
17554 17075 
18395 18232 



>2827513 

len = 

Term 
Intr 
Init 



23483 22952 
24280 23812 
24726 24647 



>2827513 
len = 



2470 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



26839 
27027 
27164 
27310 
27467 
27635 
27850 



26915 
27083 
27220 
27350 
27538 
27714 
28215 
>2827513
len = 
Sngl
>2827513 


Init 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Term 

>2827513



Sngl 
>2827513 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2827513 

len = 

Init 
Term 

>2827513 
len = 
Sngl 

>2827513 
len = 



/24184 
190 nex = 
28607 28793 
/229 
2269 nex =



28607 
28967 
29236 
29386 
29960 
30213 
30393 
30566 



28791 
29043 
29291 
29458 
30071 
30286 
30467 
30875 



402 nex = 
42615 43016 



4030 nex =



45284 
45967 
46215 
46380 
46958 
47094 
47288 
47427 
47586 
47783 
48041 
48315 



45583 
46094 
46286 
46493 
47026 
47150 
47341 
47489 
47648 
47938 
48196 
48388 



/19281 

1222 nex = 

48103 48196 
48315 48388 

/3172 

414 nex = 

69669 70082 

/13678 

696 nex = 



>2827513



79280 
len =

Sngl 

>2827538 

len = 

Init 
Term 

>2827538 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2827538 
len = 
Sngl 

>2827538 

len = 

Init 
Intr 
Term 

>2827538 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2827538 

len = 

Term 
Intr 
Intr 
Intr 



236 nex = 

82731 82496 

/107860 

756 nex =

10127 10535 

10626 10882 

735786 

2746 nex = 

11131 11383 

11755 11809 

11898 12053 

12143 12311 

12391 12455 

12534 12674 

12769 12885 

13024 13380 

13468 13525 

13612 13876 

722353 

1781 nex =

22916 24696 

/41062 

1775 nex = 

30074 30181 

31148 31303 

31408 31848 



2012 nex = 

37215 36924 

37385 37302 

37555 37480 

37720 37640 

38064 37811 

38618 38156 

/38360 

2206 nex = 

42538 42224 

42735 42673 

42925 42840 

43152 43023 
Intr 43349 43258

Intr 43650 43437 

Init 44429 44185 

>2827538 /39135

len = 1970 nex =



Term 


48221 


47882 




0 


Intr 


48414 


48382 




0 


Intr 


48655 


48508 




0 


Intr 


49319 


48997 




0 


Init 


49851 


49398 




0 



len = 1958 nex = 

Term 57306 56734 

Intr 57614 57483 

Intr 57766 57699 

Init 58691 58063 



>2827538 



/13832 



Term 
Init 



81460 80806 
82039 81838 



>2827538 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2173 

84745 
84904 
85251 
85601 
85852 
86443 
86734 



nex = 

84562 
84822 
84998 
85349 
85679 
86358 
86552 



Sngl 
>2827644 



/18053 
1301 nex = 
9200 7900 

/40232 
2013 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



50573 
51150 
51431 
51732 
51956 
52222 
52373 



51055 
51288 
51539 
51841 
52103 
52286 
52585 
Term 59427 58305
Intr 60290 60054 
Init 60647 60399 



len = 

Init 
Intr 

Intr
Intr 
Intr 
Intr 
Intr 

Intr

Term 

>2827698 

len =

Init 
Intr 
Intr 

Intr

Intr 
Term 






>2827698 



len = 
Sngl

>2827698
len = 



2027 nex = 

88415 88513 

88598 88726 

88827 88916 

89005 89155 

89262 89371 

89447 89518 

89612 89717 

89802 89996 

90088 90441 

729775 

2250 nex = 

21347 21716 

21801 21847 

21943 22057 

22752 22796 

23036 23131 

23213 23596 

/17160 

1210 nex = 

24108 25313 

736946 

2069 nex = 



Init 28482 28521 

Intr 28627 28665

Intr 28798 29172 

Intr 29262 29317 

Term 29444 29760 

>2827698 716111

len = 1796 nex = 

Init 3992 4615 

Intr 4773 4960

Term 5255 5787 



len =



1117 nex = 



3 
Init
Intr 
Term 

>2828180 

len = 

Sngl 

>2828180 

len =

Sngl 

>2828180 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2828182 

len = 

Init 
Term 



41115 41311 

41410 41571 

41636 42231 

/9396 

500 nex = 

41747 42246 

/115921 

392 nex =

41841 42232 

/32771 

2958 nex = 

53810 53931 

54276 54407 

54599 54657 

54939 55069 

55154 55239 

55348 55379 

55608 55769 

55869 55941 

56027 56127

56276 56368 

56471 56517 

56610 56767 

75344 

1193 nex = 

11326 11889 

11959 12518 



>2828182 

len = 

Init 
Term 

>2828182 

len = 

Term 
Intr 
Init 



735527 

1242 nex = 

12894 13457 
13541 14135 

7207083 

1835 nex = 

60046 59494 
60847 60710 
61328 61093 



len = 



1854 nex = 
Term
Intr 
Init 

>2828182 
len = 
Sngl 

>2828182 

len = 

Term 
Intr 
Init 

>2828182 

len = 

Init 
Term 

>2828182 

len = 

Init 
Intr 
Intr 
Term 

>2828182 
len = 
Sngl 

>2828183 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2828183

len = 

Sngl 



60046 59509 
60847 60710 
61362 61093 

/158804 

501 nex = 

62845 62345 

/15560 

2329 nex = 

62937 62347 
63667 63187 
64675 64169 

/17411 

1728 nex = 

79315 79968 
80239 81042 

/13813 

1618 nex = 

81708 81785 

81894 82074 

82162 82259 

82353 82976 

/22460 

850 nex = 

85041 85888 

79655 

1611 



15617 
15785 
15960 
16138 
16274 
16428 
16725 
16920 



nex = 

15682 
15864 
16044 
16185 
16342 
16583 
16818 
17227 



/15495 
610 nex = 
2072 1467 



>2828183



/38900 
2564



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



33533 33168 

33682 33616 

33903 33802 

34116 34003 

34399 34226 

35100 34669 

35731 35192 



>2828183 
len = 
Sngl 
>2828184 



/18973 
405 nex = 
39229 38825 

/118036 
658 nex =



Init 
Term 



14494 14634 
14779 15151 



>2828184 
len = 



1978 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



42361 
42866 
43191 
43509 
43775 
43999 



42487 
42902 
43413 
43670 
43897 
44338 



>2828184 
len = 



/17188 
3468 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



54034 
54348 
54570 
54848 
55623 
55958 
56150 
56372 
56529 
56712 
56959 
57147 



54226 
54434 
54675 
54883 
55674 
56022 
56206 
56444 
56615 
56877 
57068 
57481 



>2828185 
len = 



Init 
Term 



13901 14405 
14538 14992 



>2828185



/14679 
Sngl
>2828185 



501 nex = 
19843 20343 
733972 
2385 nex = 



Init
Intr 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Term 

>2828185



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



>2828185 
len = 



53063 
53686 
53854 
54029 
54183 
54389 
54563 
54744 
54920 



53348 
53760 
53955 
54082 
54275 
54481 
54667 
54842 
55447 



1995 nex = 



53214 
53686 
53854 
54029 
54183 
54389 
54536 
54744 
54920 



53348 
53760 
53955 
54082 
54275 
54481 
54667 
54842 
55208 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>2828186 

len = 

Init 
Intr 
Intr 
Term 

>2828186 

len = 



68405 
68622 
68914 
69103 
69384 
69646 



68507 
68658 
68999 
69198 
69546 
69802 



7108599 

1481 nex = 

14950 15033 

15143 15301 

15523 15749 

15975 16194 

720057 

1042 nex = 



Term 42331 42168 
Init 43209 43006 

>2828186
len = 
Sngl
>2828186 
len = 


Sngl 
>2828186 
len =

Sngl 
>2828186 
len = 






Init 
Intr 

Term
>2828187 
len = 


Term 
Init 



len = 

Term 
Intr 

Intr
Intr 
Intr 
Init 

>2828187
len = 
Term 

Intr
Intr 
Intr 

Init 

>2828187
len = 



43436 44319 

736363 

1150 nex = 

53511 53675 

/117347 

791 nex = 

60600 61390 

/6461 

859 nex = 

71764 72268 
72381 72437 
72534 72612 

/23099 

1114 nex = 

64086 63766 
64879 64636

/9905 

1351 nex = 

72205 71968 

72362 72299 

72523 72492 

72708 72611 

73206 73071 

73318 73233 

/91844 

1425 nex = 

72205 71961 

72362 72299 

72523 72492 

72708 72611 

73206 73071 

/142899 

1418 nex = 



Term 72205 71968 
Intr 72362 72299
Intr
Intr 
Init 

>2828188

len = 

Init 

Intr
Term 

>2828188 

len =

Init 
Intr 
Intr 

Intr

Intr 
Intr 
Intr 
Intr 

Intr
Term 

>2828188 

len =

Sngl 
>2828188 









len =



Init 
Term 



>2828188 
len = 
Sngl
>2828278 
len = 


Sngl 

>2828278 

len =

Term 
Init 



72523 72492 
72708 72611 
73206 73071 

/17485 

879 nex = 

27707 27799 
27901 28043 
28241 28522 

/1983 

2312 nex = 

30424 30603 

30707 30804 

30896 30993 

31098 31147 

31561 31656 

31753 31842 

31914 32006 

32089 32184 

32288 32353 

32425 32735 

72949 

506 nex = 

37555 37050 

/34409 

1630 nex = 

49022 49121 
49382 50625 

/101070 

672 nex = 

68221 68889 

/93971 

630 nex = 

20720 20517 

/9641 

1690 nex = 

25562 25106 
26162 26038 



>2828278



/3981 
Term 28518 27857

Intr 28679 28602

Init 29197 28762 

>2828278 /101094 

len = 261 nex =

Sngl 47484 47744 

>2828278 736576 


len = 2509 nex = 

Term 57030 56730 



Intr 57325 

Intr 57571 

Intr 57973 

Intr 58172 

Intr 58336 

Init 59238 



57174 
57416 
57688 
58056 
58274 
58896 



Init

Term 

>2832611 

len =

Init 
Intr 
Intr 

Intr

Term 

>2832611 

len =

Init 
Intr 
Intr 

Term

>2832611
len = 


Init 
Intr 
Intr 
Intr 

Intr



12902 13264 
14896 15747 



2174 nex =

26411 26644 

26778 27002 

27714 27932 

28070 28131 

28235 28584 

78768 

1456 nex = 

29030 29347 

29729 29811 

29893 30007 

30105 30485 

738712 

2988 nex = 

35656 36292 

36401 36475 

36567 36662 

36750 36835 

36921 37041 
Intr
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2832611 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>2832611 

len = 

Term 
Intr 
Intr 
Init 

>2832611 

len = 

Term 
Intr 
Intr 
Init 

>2832639 

len = 

Init 
Intr 
Intr 
Term 

>2832639 
len = 

>2832639 
len = 
Sngl 

>2832639 



37124 
37298 
37467 
37704 
37867 
38095 
38306 



1527 

67967 
68134 
68294 
68482 
68719 
68873 



37203 
37360 
37573 
37789 
37988 
38206 
38643 



nex = 

67742 
68059 
68206 
68429 
68580 
68799 



78996 
1431 nex =



70354 
70528 
71169 
71686 



70256 
70448 
70723 
71473 



/109513 

1954 nex =

82219 81803 

82755 82537 

83404 83180 

83756 83517 

/155370 

809 nex =

13801 13917 

13999 14045 

14121 14217 

14310 14609 

/16226 

1821 nex = 

/10320 

580 nex = 

22887 22331 

724883 



len = 




490 nex =



1 
Sngl

>2832639 

len = 

Term 
Intr 
Init 

>2832639 

Sngl 
>2832639 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>2832639 
len = 
Sngl 

>2832639 

len = 

Term 
Intr 
Init 

>2832639 

len = 

Init 
Intr 
Intr 
Term 



28760 29244 

/14794 

1378 nex = 

29423 29196 
29908 29796 
30573 30411 

/907 

8 02 nex = 



2754 

52225 
52507 
52668 
52853 
53005 
53159 
53319 
53494 
53622 
53843 
53979 
54133 
54315 
54555 
54755 



52432 
52573 
52761 
52909 
53074 
53223 
53400 
53543 
53765 
53913 
54048 
54198 
54473 
54652 
54978 



/15817 

753 nex = 

61848 61096 

/7501 

166 7 nex = 

62268 61093 
62493 62378 
62759 62575 

735285 

850 nex = 

8531 8838 

8913 9050 

9136 9198 

9284 9373 



60 >2832639 



/31734 
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1358 nex 



Term 
Intr 
Intr 
Init 

>2832639 

len = 

Init 
Term 

>2832639 
len = 
Sngl 

>2832667 

len = 

Init 
Term 

>2832667 

len = 

Init 
Term 

>2832667 

len = 

Term 
Intr 
Init 

>2832667 

len = 

Term 
Intr 
Init 

>2832667 

len = 

Init 
Term 



94255 94004 

94442 94354 

94664 94560 

94843 94774 

724295 

926 nex = 

95837 96095 
96192 96466 

/29009 

1136 nex = 

97830 98023 

/113536 

701 nex = 

14746 14852 
14974 15446 

/26401 

881 nex = 

16072 16194 
16286 16952 

/8156 

1771 nex = 

20352 19919 
20953 20899 
21689 21342 

/19552 

1788 nex = 

20352 19913 
20953 20899 
21700 21342 

/4390 

2022 nex = 

24444 24552 
24642 25004 



>2832667 

60 



/20563 
len = 337 nex =
Sngl 24653 24317

>2832667 /33086
len = 578 nex =
Sngl 27162 26726

>2832667 /111375
len = 1337 nex =
Term 27162 26726
Intr 27443 27248
Init 28062 27564

/41007
len = 1362 nex =
Term 27162 26726
Intr 27443 27248
Init 28083 27564

>2832667 /6848
len = 1157 nex =
Sngl 3909 2753

>2832667 /5132
len = 1474 nex =
Init 52118 52255
Intr 52656 52754
Intr 52930 52982
Intr 53135 53208
Term 53464 53591

>2832667 /35323
len = 1474 nex =
Init 52120 52255
Intr 52656 52754
Intr 52930 52982
Intr 53135 53208
Term 53464 53593

>2832667 /4495
len = 2114 nex =
Sngl 59794 60466

>2832667 /2081
Init 68491 69116
Intr 69422 69533
Term 69623 69722

>2832667 /12272
len = 2316 nex =
Init 68540 69116
Intr 69422 69533
Intr 69623 69794
Intr 69883 70085
Term 70296 70411

>2832667 /2673
len = 320 nex =
Sngl 72959 72640

>2832689 /2075
len = 595 nex =
Sngl 42824 43418

>2832689 /5753
len = 730 nex =
Sngl 42881 43602

>2832689 /33965
len = 680 nex =
Sngl 42885 43564

>2832689 /34867
len = 2058 nex =
Init 65633 65767
Intr 65857 66031
Intr 66109 66211
Intr 66295 66522
Intr 66616 66834
Term 66931 67577

>2832689 /26737
len = 1390 nex =
Init 6827 6884
Term 6976 8196

>2833627 /38715
len = 4289
Init 18099 18163
Intr 18434 18574
Intr 18921 18969
Intr 19215 19331
Intr 19412 19465
Intr 19862 19950
Intr 20099 20204
Term 20282 20669

>2833627
len = 2745 nex =
Init 18099 18163
Intr 18434 18574
Intr 18921 18969
Intr 19215 19331
Intr 19412 19465
Intr 19862 19950
Intr 20099 20204
Term 20282 20669

len = 2757 nex =
Init 18099 18163
Intr 18434 18574
Intr 18921 18969
Intr 19215 19331
Intr 19412 19465
Intr 19862 19950
Intr 20099 20204
Term 20282 20681

len = 2111
Init 67181 67444
Intr 67542 67618
Intr 67706 67828
Intr 68112 68216
Intr 68331 68403
Intr 68482 68749
Intr 68837 68966
Term 69054 69291

>2842474 /10704
len = 825 nex =
Init 21629 21886
Intr 21961 21999
Term 22252 22453

>2842474 /5117
len =
Sngl

>2842474
len = 2068 nex =
Init 35358 35949
Intr 36030 36183
Intr 36287 36350
Intr 36434 36537
Intr 36679 36748
Intr 36860 36925
Term 37010 37425

>2842474
len =
Init 37554 37887
Term 38343 38834

len = 3084 nex =
Term 39452 39140
Intr 39647 39559
Intr 39923 39818
Intr 40163 40038
Intr 40346 40261
Intr 40503 40450
Intr 40671 40608
Intr 40868 40773
Intr 41029 40955
Intr 41151 41117
Intr 41302 41233
Intr 41481 41383
Intr 41624 41597
Intr 41822 41711
Intr 42001 41911
Init 42223 42084

>2842474 /40688
len = 2028 nex =
Term 62730 62375
Intr 63029 62817
Intr 63288 63118
Intr 63523 63407
Init 63946 63768

>2842474 /35774
len = 4104 nex =
Term 68866 68144
Intr 70164 69899
Init 72247 71953

>2842474 /18360
len = 1882 nex =
Term 7149 6711
Intr 7371 7305
Intr 7647 7537
Intr 7915 7763
Intr 8238 8171
Init 8592 8366

>2853071 /19264
len = 2650 nex =
Init 28721 29386
Intr 29475 30150
Term 30940 31017

>2853071 /42503
len = 1695 nex =
Sngl 4329 2635

>2853071 /13295
len = 1851 nex =
Init 49706 49914
Intr 50003 50111
Intr 50216 50338
Intr 50428 50525
Intr 50616 50742
Intr 50822 50907
Intr 50994 51119
Term 51220 51556

>2853071 /24087
len = 799 nex =
Sngl 74493 75291

>2853071 /143232
len = 689 nex =
Term 76634 76317
Init 77005 76728

>2864607 /34819
len = 1657 nex =
Init 11411 11849
Intr 12198 12373
Term 12788 13067
>2864607 /31896
len = 490 nex =
Sngl 35491 35976

>2864607 /20798
len = 1016 nex =
Init 50921 51565
Term 51635 51936

>2864607 /30539
len = 995 nex =
Init 50974 51566
Term 51635 51968

>2864607 /3031D
len = 779 nex =
Term 56737 56505
Intr 56871 56822
Init 57157 56983

>2864607 /29964
len = 877 nex =
Term 56737 56470
Intr 56871 56822
Intr 57157 56983
Init 57346 57251

>2864607 /115583
len = 1150 nex =
Sngl 91469 90322

>2880038 /39751
len = 2230 nex =
Term 33273 33017
Intr 33540 33439
Intr 33821 33808
Intr 34285 34177
Intr 34463 34390
Intr 34593 34550
Init 34847 34762

>2880038 /20095
len =
Term 33273 32976 -
Intr 33540 33439 -
Intr 33821 33808 -
Intr 34285 34177 -
Intr 34463 34390 -
Intr 34593 34550 -
Intr 34847 34762 -
Init 35271 35063 -





>2880038 /31304
len = 310 nex =
Sngl 75 382

>2894557 /98545
len = 953 nex =
Term 11319 10903
Init 11855 11346

>2894557 /14645
len = 1007 nex =
Term 11187 10877
Intr 11502 11346
Init 11883 11602

>2894557
len = 1180
Term 13853 13556
Intr 14042 14002
Intr 14172 14109
Intr 14432 14276
Init 14735 14527

>2894557
len = 1059 nex =
Term 17997 17644
Intr 18258 18102
Init 18702 18484

>2894557 /16643
len = 1411 nex =
Term 26860 26567
Intr 27098 26942
Init 27977 27679

len = 1369 nex = 3



Reference No. 2750-942P 



Init 28664 28756
Intr 29030 29218
Term 29355 29739

>2894557 /4764
len = 610 nex =
Sngl 30727 30125

>2894557 /23223
len = 670 nex =
Term 30491 30129
Init 30795 30617

len = 1931 nex =
Term 1991 1564
Intr 2137 2084
Intr 2311 2228
Intr 2623 2409
Intr 3127 3028
Init 3494 3369

>2894557 /17340
len = 5636 nex =
Term 52206 51907
Intr 52347 52286
Intr 52539 52453
Intr 52749 52631
Intr 53083 52845
Intr 53233 53177
Intr 54012 53381
Init 57542 57102

>2894557 /32972
len = 328 nex =
Sngl 73075 72748

>2894557 /33172
len = 2020 nex =
Term 73486 72801
Intr 74137 73972
Intr 74427 74239
Init 74820 74549

/20758
len = 1959 nex = 4
Term 73486 72853
Intr 74137 73972
Intr 74427 74239
Init 74821 74549

/37898
Init 48 125
Intr 263 346
Intr 421 502
Intr 624
Intr 715 858
Intr 975 1008
Term 1090 1361



>2894591 /4613
len = 818 nex =
Sngl 11818 12635

/12013
Init 29017 29255
Intr 29840 30037
Term 30126 30344

Term 1518 1201
Intr 1688 1600
Intr 1839 1768
Intr 2024 1922
Intr 2182 2114
Intr 2336 2274
Intr 2511 2415
Intr 2749 2598
Intr 3311 2863
Init 3418 3345

>2894591 /33385
len = 1224 nex =
Init 49463
Intr 49539 49711
Term 49804 50111

>2894591 /13507
len = 1610 nex =
Init 53993 54524
Term 55134 55602
>2914688 /27602
len = 827 nex =
Init 32845 33192
Term 33282 33671

>2914688 /27570
len = 500 nex =
Sngl 3483 3969

>2914688 /30472
len = 650 nex =
Term 43833 43650
Init 44299 43939

>2914688 /39917
len = 1453 nex =
Term 63880 62989
Init 64441 63995

>2914688 /7865
len = 398 nex =
Sngl 74566 74860

>2914688 /154660
len = 406 nex =
Sngl 75023 75428

>2914688 /3956
len = 380 nex =
Sngl 75051 75430

>2914688 /146794
len = 202 nex =
Sngl 75123 75324

>2914688 /23742
len = 987 nex =
Sngl 87641 87085

>2914688 /28031
len =
Sngl
>2924505
len = 2830 nex =
Term 21014 20638
Intr 21499 21397
Intr 21723 21644
Intr 21889 21807
Intr 22051 22003
Intr 22205 22140
Intr 22801 22752
Intr 23090 22966
Init 23467 23242

>2924505 /19273
len = 1736 nex =
Init 28770 29087
Intr 29392 29483
Intr 29564 29673
Intr 29776 29839
Intr 29938 30002
Intr 30084 30149
Term 30249 30505

>2924505 /265
len = 850 nex =
Term 42073 41536
Init 42383 42158

>2924505
len = 1849 nex =
Init 54147 54502
Term 55034 55995

>2924505 /39154
len = 1631 nex =
Init 64280 64770
Intr 64863 65158
Intr 65418 65558
Term 65652 65910

>2924505 /7108
len = 1477 nex =
Init 64436 64770
Intr 64863 65158
Intr 65418 65558
>2924651
len =
Sngl

>2924651 /116144
len = 550 nex =
Sngl 11422 11968

>2924651 /4228
len = 1336 nex =
Init 16012 16108
Term 16483 17408

>2924651
len = 3460 nex =
Term 17854 17679
Intr 17949 17883
Intr 18132 18040
Intr 18319 18226
Intr 18472 18400
Intr 19189 19124
Intr 19363 19286
Intr 19517 19431
Intr 20061 19866
Intr 20202 20146
Intr 20333 20276
Intr 20770 20446
Intr 20948 20865
Init 21138 21036

>2924651
len = 797 nex =
Init 27009 27125
Intr 27212 27285
Intr 27349 27473
Term 27545 27805

>2924651 /101608
len = 798 nex =
Init 27010 27125
Intr 27212 27473
Term 27545 27807

>2924651 /11215
len = 1990 nex =
Term 40287 39899
Intr 40599 40363
Intr 41110 40743
Init 41882 41190



>2924651 /124621
len = 1394 nex =
Init 42248 42476
Intr 42705 42841
Intr 42918 42975
Intr 43069 43146
Term 43242 43641

>2924651 /115662
len = 1586 nex =
Term 54089 53651
Intr 54599 54185
Init 55236 54985

>2924651
len = 989
Init 64216 64414
Intr 64498 64577
Intr 64656 64726
Intr 64819 64872
Term 64968 65204

>2924651
len =
Term 67750 67536
Init 68055 67838

>2924651
len =
Term 68218 58177
Intr 68390 68313
Intr 68583 68492
Intr 68745 68667
Intr 68963 68830
Intr 69280 69052
Init 69566 69388

>2924651 /18901
len = 2963 nex =
Term 73319 73016
Intr 73588 73415
Intr 73756 73694
Intr 73960 73853
Intr 74160 74045
Intr 74372 74246
Intr 74635 74466
Intr 74806 74728
Intr 75007 74896
Intr 75150 75062
Intr 75362 75232
Init 75978 75624

>2924651 /111348
len = 681 nex =
Init 79079 79259
Term 79349 79759

>2924652 /4911
len = 1030 nex =
Term 19435 19201
Intr 19824 19635
Init 20222 20090

>2924652 /36891
len = 1540 nex =
Term 34140 33486
Intr 34498 34229
Init 35025 34629

>2924652 /6935
len = 1820 nex =
Term 34140 33312
Intr 34498 34229
Init 35131 34629



>2924653
len =
Sngl

>2924653 /40303
len = 2207 nex =
Init 48858 49136
Intr 49281 49579
Intr 49695 49920
Intr 50013 50258
Intr 50352 50462
Intr 50538 50669
Term 50749 51064

>2924653 /37280
Init 5790 5865 + 0
Intr 6335 6391 + 0
Intr 6473 6632 + 0
Intr 6769 6851 + 0
Intr 6950 7009 + 0
Intr 7106 7214 + 0
Intr 7290 7342 + 0
Intr 7437 7517 + 0
Intr 7595 7707 + 0
Term 7811 8147 + 0



>2924653 /10022
len = 2002 nex =
Init 64053 64145
Intr 64317 64444
Intr 64584 64688
Term 65141 65503

>2924653 /12547
len = 2003 nex =
Init 64053 64145
Intr 64317 64444
Intr 64584 64688
Term 65141 65504

>2924653 /6841
len = 1345 nex =
Init 70235 70473
Intr 70569 70632
Term 71064 71579

>2924653 /22448
len = 1249 nex =
Init 70377 70473
Intr 70569 70632
Term 71064 71625

>2924653 /92808
len = 850 nex =
Term 74646 74511
Intr 74845 74760
Init 75351 74975

>2924654 /30054
len = 2574 nex =
Init 14617 14705
Intr 15035 15246
Intr 15355 15634
Intr 15717 15788
Intr 15879 15995
Intr 16116 16181
Term 16657 16957

/37363
len = 2618 nex =
Init 14617 15246
Intr 15355 15634
Intr 15717 15788
Intr 15879 15995
Intr 16116 16181
Term 16657 17013

/41320
len = 2533 nex =
Init 14422 14705
Intr 15035 15246
Intr 15355 15634
Intr 15717 15788
Intr 15879 15995
Intr 16116 16181
Term 16657 16954



>2924654
len = 997 nex =
Sngl 29862 28866

>2924654 /37399
len = 2110 nex =
Init 41781 42184
Intr 42289 42366
Intr 42453 42522
Intr 42650 42927
Term 43021 43108

>2924655 /33830
len = 2316 nex =
Term 18451 18193
Intr 18825 18612
Intr 19016 18919
Intr 19270 19142
Intr 19331 19289
Intr 19690 19580
Intr 19925 19779
Intr 20137 20024
Intr 20329 20200
Init 20508 20379

>2924655 /37918
len = 799 nex =
Init 34127 34257
Intr 34452 34520
Term 34618 34925

>2924728 /112173
len = 321 nex =
Sngl 16725 16405

>2924728 /20919
len = 3224 nex =
Init 39629 39916
Intr 40038 40180
Intr 40259 40351
Intr 40438 40554
Intr 40669 40764
Intr 40904 41017
Intr 41217 41270
Intr 41480 41527
Intr 41825 41975
Intr 42080 42245
Intr 42327 42396
Term 42476 42852

>2924728 /15599
len = 290 nex =
Sngl 46957 46668

>2924728 /32450
len = 2230 nex =
Term 47403 46673
Init 48900 48211

>2924728 /11304
len = 1048 nex =
Sngl 4497 3866

>2924728 /11830
len = 740 nex =
Term 56509 56324
Intr 56849 56595
Init 57063 56929
>2924729 /32977
len = 597 nex =
Sngl 32483 31887

>2924729 /8150
len = 1108 nex =
Term 37394 37188
Intr 37888 37601
Init 38295 37976

>2924729 /108595
len = 698 nex =
Sngl 41180 40483

>2924729 /27221
len = 1539 nex =
Term 60702 60586
Intr 61397 61275
Init 61782 61669

>2924729 /12817
len = 1994 nex =
Term 60702 60586
Intr 61397 61275
Init 62241 61669

>2924730 /14419
len = 559 nex =
Sngl 3605 3059

>2924730 /9214
len = 328 nex =
Sngl 8331 8658

>2924731 /34161
len = 2210 nex =
Init 3351 3568
Intr 3663 3747
Intr 3839 3906
Intr 4037 4087
Intr 4172 4245
Intr 4652 4710
Intr 4794 4870
Term 5021 5301






>2924732
len = nex =
Term 15136 14749
Intr 15419 15211
Intr 15598 15512
Intr 15788 15678
Intr 16145 16078
Init 16299 16228

>2924732 /118484
len = 1335 nex =
Term 17876 17178
Init 18512 18040

>2924732 /110980
len = 1726 nex =
Term 2588 2145
Intr 2744 2665
Intr 2992 2845
Intr 3155 3095
Intr 3489 3265
Init 3870 3763

>2924732 /10411
len = 770 nex =
Sngl 43741 42972

>2924732 /19177
len = 829 nex =
Sngl 45210 44382

>2924732 /37398
len = 2350 nex =
Term 72077 71790
Intr 72317 72183
Intr 72514 72395
Intr 72722 72612
Intr 73197 72982
Intr 73331 73278
Intr 73501 73430
Intr 73771 73595
Init 74137 73854



len = 573 nex = 2
Term 12077 11784 - 0
Init 12356 12172 - 0

>2924733 /30414
len = 397 nex = 2
Init 2448 2519 + 0
Term 2615 2844 + 0

>2924733 /147492
len = 621 nex = 1
Sngl 38573 37953 - 0

>2924733 /8989
len = 2384 nex = 8
Init 465 737 + 0
Intr 820 941 + 0
Intr 1362 1477 + 0
Intr 1616 1689 + 0
Intr 1800 2064 + 0
Intr 2148 2279 + 0
Intr 2384 2519 + 0
Term 2615 2848 + 0

>2924733 /20896
len = 2392 nex = 8
Init 465 737 + 0
Intr 820 941 + 0
Intr 1362 1477 + 0
Intr 1616 1689 + 0
Intr 1800 2064 + 0
Intr 2148 2279 + 0
Intr 2384 2519 + 0
Term 2615 2856 + 0

>2924733 /25282
len = 2410 nex = 8
Init 465 737 + 0
Intr 820 941 + 0
Intr 1362 1477 + 0
Intr 1616 1689 + 0
Intr 1800 2064 + 0
Intr 2148 2279 + 0
Intr 2384 2519 + 0
Term 2615 2868 + 0

>2924733 /15239
len = 976 nex = 1
Sngl 47202 46227 - 0
len = 1100 nex = 1
Sngl 51162 52261 + 0

>2924733 /39485
len = 2050 nex = 6
Term 57509 57170 - 0
Intr 57700 57603 - 0
Intr 57992 57801 - 0
Intr 58319 58246 - 0
Intr 58474 58387 - 0
Init 59210 58968 - 0

>2924733 /28716
len = 2215 nex = 3
Init 9164 9224 + 0
Intr 9724 10187 + 0
Term 10288 11378 + 0

>2924768 /954
len = 1171 nex = 3
Term 20675 20418 - 0
Intr 20828 20757 - 0
Init 21588 21272 - 0

>2924768 /22220
len = 970 nex = 2
Term 33367 33014 - 0
Init 33981 33703 - 0

>2924768 /36488
len = 3370 nex = 14
Init 49401 49566 + 0
Intr 49979 50055 + 0
Intr 50164 50248 + 0
Intr 50331 50457 + 0
Intr 50547 50645 + 0
Intr 50719 50786 + 0
Intr 50867 50974 + 0
Intr 51064 51141 + 0
Intr 51259 51391 + 0
Intr 51486 51582 + 0
Intr 51750 51873 + 0
Intr 51966 52074 + 0
Intr 52362 52445 + 0
Term 52537 52762 + 0






>2924768 /2443
len = 2009 nex = 5
Init 54917 55312 + 0
Intr 55789 55930 + 0
Intr 56024 56144 + 0
Intr 56337 56399 + 0
Term 56526 56925 + 0

>2924768 /40107
len = 1793 nex = 2
Init 55097 55930 + 0
Term 56024 56889 + 0

>2947056 /35441
len = 1884 nex = 7
Term 24625 24082 - 0
Intr 24833 24712 - 0
Intr 25012 24909 - 0
Intr 25146 25116 - 0
Intr 25374 25244 - 0
Intr 25569 25472 - 0
Init 25953 25760 - 0

>2947056 /18146
len = 884 nex = 5
Term 52485 52355 - 0
Intr 52728 52640 - 0
Intr 52902 52813 - 0
Intr 53019 52994 - 0
Init 53238 53111 - 0

>2961335 /19666
len = 970 nex = 3
Init 100334 100454 + 0
Intr 100557 100713 + 0
Term 100813 101289 + 0

>2961335 /42504
len = 1641 nex = 4
Term 101428 101392 - 0
Intr 102385 102247 - 0
Intr 102589 102521 - 0
Init 103032 102688 - 0

>2961335 /37217
len = 1635 nex = 4






Term 101428 101398
Intr 102385 102247
Intr 102589 102521
Init 103032 102688

>2961335 /39963
len = 889 nex =
Sngl 31532 30644

>2961335 /36866
len = 2013 nex =
Term 31944 30671
Intr 32165 32030
Intr 32386 32252
Init 32683 32529

>2961335 /21244
len = 1212 nex =
Sngl 33841 33295

>2961335 /322
len = 2495 nex =
Term 85172 85058
Intr 85456 85295
Intr 85703 85608
Intr 85906 85842
Intr 86542 86467
Intr 86714 86632
Intr 87177 86849
Intr 87304 87258
Init 87546 87386

>2961370 /14792
len = 1218 nex =
Init 2148 2639
Term 2907 3365

>2961370 /41090
len = 271 nex =
Sngl 3095 3365

>2961370 /36110
len = 2782 nex =
Init 31733 31968
Intr 32075 32200
Intr 32291 32450
Intr 32544 32729
Intr 32831 32947
Intr 33047 33100
Intr 33180 33251
Intr 33395 33579
Intr 33685 33779
Intr 33862 33936
Intr 34061 34144
Term 34259 34514

>2961370 /9402
len = 329 nex =
Sngl 52923 53251

>2961370 /12881
len = 375 nex =
Sngl 88516 88142

>2979540 /12830
len = 1150 nex =
Term 31157 30569
Init 31718 31304



>2979540 /11908
len = 238 nex =
Sngl 43116 43353

>2979540 /9990
len = 1645 nex =
Term 53451 53377
Intr 53944 53849
Intr 54297 54187
Init 54719 54557

>2979540 /5467
len = 3267 nex =
Term 61277 60884
Intr 61544 61503
Intr 61708 61667
Intr 61911 61812
Intr 62080 62019
Intr 62353 62272
Init 63560 63356

/100354
len = 1123 nex = 3






Term 68542 68068
Intr 68676 68629
Init 69047 68997

>2979540 /18662
len = 1011 nex =
Init 84864 84965
Intr 85100 85128
Term 85223 85515

>2979540 /9848
len = 478 nex =
Init 84863 84965
Intr 85100 85128
Term 85223 85340

>2979540 /123760
len = 631 nex =
Term 9120 9002
Intr 9356 9289
Init 9632 9459

>2979540 /35578
len = 976 nex =
Sngl 9822 10797

>2980757 /96478
len = 628 nex =
Term 101920 101603
Init 102230 102001

>2980757 /29888
len = 1706 nex =
Init 105420 106491
Term 106578 106835

>2980757 /20759
len = 1674 nex =
Sngl 111308 109635

>2980757 /32640
len = 1975 nex =
Term 113953 113573
Intr 114107 114042
Intr 114352 114249
Intr 114505 114432
Init 115547 115405



len = 2050 nex =
Term 113953 113573
Intr 114107 114042
Intr 114352 114249
Init 114505 114432

>2980757 /39459
len = 2077 nex =
Term 113953 113570
Intr 114107 114042
Intr 114352 114249
Init 114505 114432

/20888
len = 3039 nex =
Term 116278 115618
Intr 116459 116365
Intr 116906 116773
Intr 117244 116981
Intr 117420 117335
Intr 117593 117501
Intr 117776 117681
Intr 117973 117866
Init 118656 118366

>2980757
len =
Term 19947 19431
Intr 20153 20130
Init 20797 20451

>2980757
len = 1891 nex =
Term 22249 22095
Intr 22401 22353
Intr 22603 22491
Intr 22808 22697
Intr 23041 22956
Intr 23548 23465
Init 23985 23709

/40278
len = 2700 nex =
Init 62319 62472
Intr 62599 62662
Intr 62935 63182
Intr 63261 63404
Intr 63478 63629
Intr 63712 63834
Intr 63933 64110
Intr 64204 64322
Intr 64409 64523
Term 64665 65018

>2980757
len = 1620 nex =
Init 70442 70911
Intr 70996 71076
Intr 71232 71333
Intr 71626 71700
Term 71818 72061

/31287
nex =
Init 8024 8864
Intr 9004 9089
Intr 9190 9363
Intr 9500 9614
Intr 9732 9829
Intr 9974 10109
Term 10228 10447

/27462
len = 2028 nex =
Term 46006 45949
Intr 46321 46080
Intr 46522 46410
Init 47415 47305

/39073
len = 1651
Init 15971 16170
Intr 16291 16381
Intr 16472 17062
Intr 17134 17233
Term 17529 17621

/15729
len = 1699 nex =
Init 15980 16150
Intr 16291 16381
Intr 16472 17062
Intr 17134 17233








Term 17529 17678 + 0

>3004543 /1314
len = 1330 nex = 5
Init 21535 21572 + 0
Intr 21645 21756 + 0
Intr 21837 21934 + 0
Intr 22311 22407 + 0
Term 22482 22571 + 0

>3004543 /3350
len = 1450 nex = 3
Term 44799 44453 - 0
Intr 45325 45228 - 0
Init 45894 45618 - 0

>3004543 /36487
len = 770 nex = 2
Term 45325 45258 - 0
Init 46027 45618 - 0

>3004543 /16223
len = 1140 nex = 4
Init 78104 78275 + 0
Intr 78360 78542 + 0
Intr 78629 78810 + 0
Term 78902 79243 + 0

>3021263 /1667
len = 868 nex = 1
Sngl 23858 22991 - 0

>3033373 /21342
len = 747 nex = 1
Sngl 15077 15823 + 0

>3033373 /29605
len = 595 nex = 1
Sngl 39454 40045 + 0

>3033373 /6606
len = 1090 nex = 1
Sngl 79964 78877 - 0
>3033373
len = 2003 nex =
Init 82999 83318
Intr 83637 83702
Intr 83796 83846
Intr 83938 84017
Intr 84308 84352
Term 84467 85001

>3033373 /5509
len = 3955 nex =
Init 85397 85674
Intr 85852 85906
Intr 85992 86143
Intr 86819 86858
Intr 87036 87128
Intr 87213 87328
Intr 87417 87491
Intr 87611 87650
Intr 87739 87821
Intr 87911 87988
Intr 88122 88265
Intr 88458 88525
Intr 88608 88722
Intr 88830 88986
Term 89057 89351

>3036791
len = 1935 nex =
Term 30641 30380
Intr 30794 30729
Intr 30936 30872
Intr 31426 31363
Intr 31610 31501
Intr 31784 31697
Init 32314 31905

>3036791 /11954
len = 1033 nex =
Sngl 56760 55728

>3036791 /103458
len = 1150 nex =
Sngl 58836 59073

>3036791 /4329
len = 1227 nex =
Init 90374 91016
Term 91266 91600

>3036791 /14578
len = 880 nex =
Sngl 96436 97315

>3046847 /8184
len = 3010 nex =
Term 8682 8425
Intr 8887 8808
Intr 9087 8960
Intr 9281 9216
Intr 9673 9622
Intr 9870 9785
Intr 10314 10151
Intr 10511 10395
Intr 10668 10598
Init 11428 11126



>3046847 /37197
len = 1990 nex =
Term 11964 11553
Intr 12305 12048
Intr 12692 12466
Intr 12917 12776
Init 13535 13263

>3046847 /19893
len = 858 nex =
Sngl 39899 39042

>3046847 /92102
len = 992 nex =
Sngl 42005 41014

>3046847 /2037
len = 585 nex =
Term 43375 42957
Init 43541 43466

>3046847 /95135
len = 1247 nex =
Init 49610 50038
Term 50471 50856

>3046847 /9324
len = 479 nex =
Sngl 54499 54021

>3046849 /14126
len = 1577 nex =
Term 28105 27242
Init 28795 28556



>3046849 /156665
len = 839 nex =
Term 28105 27980
Init 28795 28556

>3046849 /3963
len = 1488 nex =
Term 39835 39089
Init 40576 40199

>3046849 /13314
len = 1665 nex =
Init 44226
Intr 44558 44738
Intr 44824 44990
Intr 45085 45167
Term 45367 45890

>3046849 /40824
len = 1361 nex =
Init 44387
Intr 44558 44738
Intr 44824 44990
Intr 45085 45167
Term 45367 45747

>3046850 /32178
len = 1227 nex =
Term 49 1
Intr 218 144
Intr 570 392
Intr 748 681
Init 1227 972

/23581
len = 1553 nex = 4
Term 26495 26176
Intr 27046 26959
Intr 27255 27139
Init 27728 27452

>3046850 /4275
len = 1570 nex =
Term 26495 26189
Intr 27046 26959
Intr 27255 27139
Init 27752 27452

>3046850 /18867
len = 2066 nex =
Term 35882 35690
Intr 36071 35971
Intr 36216 36177
Intr 36347 36300
Intr 36722 36567
Init 37087 36940

>3046850 /1680
len = 580 nex =
Term 44554 44138
Init 44695 44629

>3046850 /27167
len = 1215 nex =
Term 44554 44109
Intr 44695 44629
Init 45323 45131

>3046850 /2557
len = 790 nex =
Term 58982 58595
Intr 59222 59091
Init 59377 59312

>3046850 /21485
len = 1210 nex =
Init 68376 68477
Intr 68594 68717
Intr 69144 69216
Term 69364 69581

/7347
len = 1314 nex = 7
Term 69704 69481
Intr 69872 69789
Intr 70063 69955
Intr 70270 70147
Intr 70434 70338
Intr 70647 70515
Init 70794 70726

>3046850
len = 3435 nex =
Term 69704 69439
Intr 69872 69789
Intr 70063 69955
Intr 70270 70147
Intr 70434 70338
Intr 70647 70515
Intr 70803 70726
Intr 70979 70872
Intr 71133 71060
Intr 71700 71616
Init 71842 71778



>3046851 /39764
len = 2129 nex =
Init 16125 17251
Term 17913 18253

>3046851 /19097
len = 2129 nex =
Init 16148 17251
Term 17913 18276

>3046851 /39024
len = 1270 nex =
Init 17002 17251
Term 17913 18259

>3046851 /108127
len = 1035 nex =
Init 17225 17251
Term 17913 18259

>3046851 /42919
len = 2530 nex =
Init 23586 23813
Intr 23901 24521
Intr 24672 24859
Intr 25008 25246
Intr 25330 25529
Intr 25634 25759
Term 25845 26114



>3046851 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



1887 

24250 
24672 
25008 
25330 
25634 
25845 



nex = 

24521 
24859 
25246 
25529 
25759 
26136 



>3046851 
len = 
Sngl 

>3046852 
len = 



5153 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



5264 
5566 
5753 
6129 
6633 
7388 
7603 
7755 
8145 
8399 
8654 
8799 
9050 



5042 
5461 
5662 
6021 
6544 
7136 
7476 
7696 
8089 
8355 
8573 
8759 
9006 



>3046852 
len = 



Init 
Intr 
Term 



42931 43061 
43292 43357 
43687 43968 



>3046853 
len = 



1821 nex 



Init 
Intr 
Intr 
Intr 
Term 



32540 
32975 
33248 
33646 
33987 



32883 
33099 
33493 
33901 
34360 



60 len = 



1810 nex = 



5 
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Init 
Intr 
Intr 
Intr 
Term 



32589 32883 

32975 33099 

33248 33493 

33646 33901 

33987 34392 



>3046853 
len = 
Sngl 

>3046853 
len = 



/21929 
370 nex = 
34040 34389 
/102017 
1727 nex = 



Init 
Term 



63162 63584 
64256 64888 



>3046853 
len = 



/22470 
1319 nex =



Term 
Intr 
Intr 
Init 



69525 69198 

70117 69922 

70247 70205 

70516 70343 



>3046854 



/10299 
1773 nex 



Init 
Term 



19008 19865 
20555 20780 



>3046854 
len = 



/11612 
1480 nex =



Init 
Intr 
Term 



21912 22049 
22395 22463 
22582 23391 



>3046854 
len = 



73994 
730 nex = 



Term 
Init 



23612 23317 
24037 23700 



>3046854 
len = 



734699 
1554 nex = 



Init 
Intr 
Term 



38733 39282 
39431 39721 
39804 40286 



>3046854



71816 



Reference No. 2750-942P 



len 



1677 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3046854 

len = 

Term 
Intr 
Init 

>3046854 

len = 

Init 
Intr 
Term 

>3046855 

len = 

Sngl 

>3046855 

len = 

Sngl 

>3046855 

len = 

Init 
Intr 
Term 

>3046855 

len = 

Init 
Intr 
Term 

>3046855 

len = 



57210 57371 

57484 57534 

57762 57813 

58215 58273 

58430 58475 

58598 58886 

737233 

1164 nex = 

78862 78599 
79238 78963 
79762 79617 

/38961 

1774 nex = 

9436 9728 
9833 10113 
10198 10667 

/108313 

657 nex =

23798 23142 

732984 

970 nex = 

24087 23142 

/14710 

1037 nex =

47198 47333 
47668 47769 
47850 48073 

/143364 

697 nex = 

47201 47333 
47668 47769 
47850 47897 

/115178 

1558 nex = 



Init 51391 51541 
Intr 51756 51855
Intr
Intr 
Term 

>3046855 

len =

Term 
Intr 
Init 

>3046855 

len = 

Term 
Init 

>3046855 

len = 

Term 
Intr 
Init 

>3046855 

len = 

Term 
Intr 
Init 

>3046855 

len = 

Term 
Intr 
Init 

>3046855 

len = 

Init 
Intr 
Intr 
Term 

>3046855 

len = 

Init 
Term 



52045 52191 
52278 52358 
52709 52948 

/1440 

743 nex = 

61068 60893 
61285 61159 
61635 61504 

/151892 

970 nex = 

61285 60730 
61692 61504 

/20686 

670 nex =

61068 61024 
61285 61159 
61692 61504 

79633 

937 nex = 

61068 60759 
61285 61159 
61695 61504 

/107804 

911 nex = 

61068 60785 
61285 61159 
61695 61504 

/21689 

1465 nex =

63956 64174 

64250 64483 

64572 64658 

64758 65420 

/92780 

810 nex = 

64590 64658 
64758 65399 



>3046855



/116709
Init
Intr 
Term 



72683 72877 
72984 73124 
73257 73458 



>3046855 

len = 

Term 
Intr 
Intr 
Init 



3759 nex = 

74593 73489 

75227 74671 

76973 76756 

77247 77060 



>3046856 

len = 

Init 
Intr 
Term 



/19567 



10176 10234 
10433 11125 
11211 11636 



>3046856 

len = 

Sngl 

>3046855 

len = 

Sngl 

>3046856 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>3046856 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



20236 19688 



1469 nex = 
41267 41096 



1522 



nex =



40068 39764 

40233 40144 

40498 40331 

40851 40586 

41285 41096 

739378 



2380 

42386 
42533 
42914 
43341 
43540 
43935 
44339 



42447 
42663 
42997 
43451 
43731 
44032 
44765 



>3046856 




725234 






Init 
Term 

>3046856 

len = 

Term 
Intr 
Init 

>3046856 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3046856 

len = 

Term 
Intr 
Init 

>3047060 

len = 

Init 
Term 

>3047060 

len = 

Sngl 

>3047074 

len = 

Sngl 

>3047074 

len = 



43935 44032 
44339 44773 

/7861 

2553 nex = 

61214 60794 
61683 61434 
63346 61947 

734558 

2860 



65245 
65449 
65643 
65778 
65936 
66094 
66219 
66631 
66928 



nex = 

64782 
65358 
65616 
65739 
65875 
56031 
66181 
66577 
66836 



/33481 

1393 nex = 

74793 73895 
75040 74878 
75287 75126 

/107700 

1095 nex = 



27099 
27822 



27748 
28193 



742538 
675 nex = 
30075 29401 
730174 
518 nex = 
102999 102484 
73550 
2660 nex =



Term 104534 103979 
Intr 104851 104618
Intr 105248 104963

Intr 105650 105501 

Intr 105885 105812 

Init 106638 106431 



len 

Term

Intr 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Intr 
Intr 

Intr

Init 

>3047074 

len =

Init 
Term 

>3047074
len = 
Sngl 


>3047088 

len = 

Init
Intr 
Intr 
Term 

>3047088
len = 
Term 

Intr
Intr 
Init 

>3047088 


len = 



2304 

77864 
78105 
78294 
78454 
78603 
78757 
78979 
79155 
79339 
79549 
79716 
80060 



77757 
77960 
78189 
78377 
78550 
78697 
78855 
79069 
79266 
79420 
79634 
79919 



/14379 

1310 nex = 

88948 89130 
89599 90257 

/33112 

596 nex = 

89686 90281 

/38281 

1605 nex = 

13957 14138 

14229 14418 

14734 14808 

14903 15561 

/40501 

2311 nex = 

16023 15615 

16385 16228 

16857 16804 

17417 16963 

/20286 

4330 nex = 



Init 76297 76485 
Intr 76590 76926 
Intr 77004 77145






Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



77287 
77477 
77737 
78172 
78398 
78763 
79033 
79204 
79374 
79696 
79926 
80061 
80359 



77385 
77533 
77808 
78261 
78457 
78948 
79125 
79285 
79480 
79771 
79972 
80123 
80623 



>3047088



len =



1594 



nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



79033 
79204 
79374 
79696 
79926 
80061 
80359 



79125 
79285 
79480 
79771 
79972 
80123 
80623 



>3047100 
len = 



/34671 
3995 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



15513 
15672 
15953 
16133 
16307 
16466 
16799 
17024 
17262 
17513 
17785 
17986 
18316 
18618 
19056 
19230 



15554 
15764 
16020 
16208 
16353 
16582 
16876 
17111 
17374 
17575 
17889 
18045 
18417 
18666 
19124 
19507 



>3047100 
len = 



79233 
1124 ne 



Init 
Term 



>3047100



19593 
20386 



19904 
20716 



Term 20944 20738 
Init 21458 21292
>3047100
len = 
Sngl 
>3047100 
len = 
Sngl 
>3047100 
len = 



39059 38776 
/545 



Term 
Init 



67746 67593 
68090 67824 



>3047100 
len = 



Term 
Intr 
Init 



67746 67416 
68089 67824 
68512 68419 



>3047100 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3056579 
len = 
Sngl 

>3056579 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



3207 nex =



78779 
79083 
79317 
79542 
79739 
79921 
80200 
80464 
80658 



78999 
79218 
79450 
79661 
79824 
80109 
80352 
80577 
81268 



/158942 
831 nex = 
32993 33823 
/13461 
3450 nex = 



42933 
44426 
44764 
44977 
45292 
45624 
45872 



44316 
44536 
44900 
45187 
45529 
45774 
46382 
>3056579

len = 

Term 
Intr 
Init 

>3059018 

len = 

Term 
Init 

>3059018 

len = 

Init 
Intr 
Intr 
Term 

>3059018 

len = 

Sngl 

>3059018 

len = 

Sngl 

>3059018 

len = 

Init 
Intr 
Term 

>3063438 
len = 
Sngl 

>3063438 

len = 

Init 
Intr 
Intr 
Term 



/38645 

1408 nex = 

62928 62777 
63094 63020 
63938 63662 

/29133 

1298 nex =

19437 19300 
19744 19558 

/20592 

2679 nex = 

70884 70974 

72180 72269 

72361 72445 

72984 73562 

/38430 

1379 nex = 

72180 72270 

738689 

610 nex = 

82744 83198 

/18947 

1732 nex =

87586 88234 
88340 88417 
88957 89317 

/7188 

861 nex = 

106806 107666 

/32337 

1841 nex = 

10816 11218 

11470 11646 

11732 11896 

11987 12656 



>3063438



/20276 
Term 108108 107703

Intr 108949 108833

Init 109114 109038 

>3063438 733493 

len = 2171 nex =



Init 
Intr 
Intr 

Intr
Intr 
Intr 
Term 

>3063438



125603 
125795 
126029 
126303 
126429 
126676 
126954 



125698 
125896 
126124 
126334 
126504 
126729 
127179 



len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



2133 

125603 
125795 
126029 
126303 
126429 
126676 
126954 



nex = 

125698 
125896 
126124 
126334 
126504 
126729 
127173 



2136 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



125603 
125795 
126029 
126303 
126429 
126676 
126954 



125698 
125896 
126124 
126334 
126504 
126729 
127176 



1210 nex = 



Term 13932 13866 
Init 14277 14113 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 



47185 
47500 
47724 
47999 
48148 
48315 



nex = 

47114 
47405 
47599 
47942 
48093 
48238 
Intr


48516 48412 




0 




Intr 


48701 48602 




0 




Init 


49201 49002 


-


0 


5 


>3063438 


/43026 








len = 


1060 nex = 


1 






Sngl 


68860 67801 




0 


10 












>3063438


/27805 








len = 


1272 nex = 


1 




15 


Sngl 


8642 7563 


-


0 




>3063690 


/40949 








len = 


2193 nex = 


3 




20 












Init 


18843 19345 


+ 


0 




Intr 


19886 19966 


+ 


0 






20222 21035 


+ 


0 


25 


>3063690 


/18482 








len = 


3096 nex = 


5 






Term 


21537 21211 




0 


30 


Intr 


21947 21729 


- 


0 




Intr 


22174 22060 




0 




Intr 


22529 22282 




0 




Init 


24306 22854 


- 


0 


35 


>3063690 


/35221 








len = 


1829 nex =


2 






Init 


51969 52232 


+ 


0 


40 




52962 53797 


+ 


0 




>3063690 


739535 








len = 


1673 nex = 


3 




45 












Init 


89115 89369 


+ 


0 




Intr 


89531 89856 


+ 


0 




Term 


90448 90787 


+ 


0 


50 


>3063690 


/28609 








len = 


1510 nex = 


3 






Term 


94514 94452 




0 


55 


Intr 


95366 95253 




0 




Init 


95700 95452 




0 



len =



1352 nex = 
Term
Intr 
Intr 
Init 



9605 9217 

9840 9694 

10174 9918 

10568 10350 



>3068702 
len = 



738567 



1554 



Term 
Intr 
Intr 
Init 



9605 9176 

9840 9694 

10174 9918 

10729 10350 



76926 



Term 
Init 



12518 
13158 



12287 
12549 



>3068702 
len = 



73844 
3670 ne 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



28154 
28332 
28985 
29153 
29837 
30000 
30183 
30416 
30579 
30782 
30970 
31445 



27776 
28240 
28814 
29077 
29728 
29941 
30110 
30293 
30502 
30676 
30866 
31360 



Term 
Intr 
Init 



52373 51897 
52618 52465 
53311 52956 



>3075383 
len = 



737427 
2556 nex 



Init 
Intr 
Term 



47142 47616 
48151 48776 
48851 49697 



>3075383 
len = 



74752 
562 nex 



Sngl



49196 49757 



0 
>3075383

len = 

Init 
Intr 
Term 



733535 

1150 nex = 

51795 51949 
52409 52543 
52636 52942 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3080352

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2558 nex =



53383 
53656 
53916 
54085 
54632 
54787 
54992 
55322 



53089 
53563 
53743 
53994 
54523 
54712 
54900 
55106 



1590 nex 



40284 
40461 
40639 
40838 
40960 
41121 
41295 
41687 



40098 
40369 
40558 
40768 
40928 
41092 
41227 
41583 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3080352 
len = 
Sngl 

>3080352 
len = 



1694 nex 



40284 
40461 
40639 
40838 
40960 
41121 
41295 
41730 



40037 
40369 
40558 
40768 
40928 
41092 
41227 
41583 



739492 
1192 nex = 
56521 57712 
7119432 
1076 nex =



Term 59135 58928 
Intr 59332 59218
Intr
Init 



59524 59415 
50003 59857 



3652 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



60315 
60539 
60747 
60927 
61080 
61235 
61529 
61701 
61918 
62170 
62546 
62781 



60098 
60394 
60642 
60850 
61027 
61175 
61405 
61615 
61839 
62041 
62464 
62669 



2126 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



65646 
65897 
66105 
66302 
66445 
66599 
66819 
67006 
67165 
67356 



65231 
65752 
66000 
66225 
66392 
66539 
66695 
66920 
67086 
67258 



736325 
3216 nex = 






Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



65646 
65897 
66105 
66302 
66445 
66599 
66819 
67006 
67165 
67387 
67607 
67822 



65229 
65752 
66000 
66225 
66392 
66539 
66695 
66920 
67086 
67258 
67525 
67707 



>3080352 
len =

Sngl 



681 nex = 
77159 76479 



>3080406 




732793 
Init


3323 


3465 




0 


Intr 


3592 


3780 


+ 


0 


Intr 


3945 


4107 


+ 


0 


Intr 


4211 


4403 


+ 


0 


Term 


4497 


4800 




0 



len = 

Term 
Intr 

Intr
Intr 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Init 

>3080406
len = 



2965 nex 



28758 
29006 
29136 
29290 
29583 
29930 
30171 
30408 
31062 
31203 
31485 



1493 



28521 
28912 
29088 
29231 
29462 
29882 
30100 
30308 
30951 
31134 
31423 



nex 



Term 

Intr

Intr 
Intr 
Init 

>3080406
len = 
Init 

Intr

Intr 
Term 

>3080406 



5013 4742 

5195 5109 

5342 5307 

5651 5451 

5811 5725 

/12228 

2116 nex = 

62410 63011 

63132 63371 

63788 63866 

63967 64525 

/2912 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



2158 

65401 
65941 
66212 
66528 
66715 
66903 
67158 



nex = 

65852 
66039 
66406 
66620 
66810 
67043 
67558 



>3080406 
len = 



/1368 
1665 nex =






Term 



84894 84458 



0 
Intr
Intr 
Intr 
Intr 
Init 



85064 84994 

85285 85188 

85426 85354 

85611 85567 

86122 85694 



>3080430 

Sngl 
>3080430 



/14423 



len =



Term 
Intr 
Intr 
Intr 
Intr 
Init 



903 
1133 
1450 
1804 
2094 
2524 



736 
998 
1388 
1758 
1882 
2295 



>3080430 
len = 



Init 
Term 



36053 
36245 



36158 
36811 



>3080430 

len = 

Term 
Intr 
Intr 
Intr 
Init 



1829 nex = 

38834 38436 

39547 39329 

39701 39626 

39938 39797 

40254 40174 



>3080430 
len = 



/1978 



2536 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3080430 

len = 



72981 
73256 
73521 
73732 
74041 
74377 
74609 
74761 
75104 



72569 
73068 
73345 
73617 
73815 
74130 
74474 
74693 
74843 



/6464 
1702 nex 



Term 79588 79377 
Init 81078 80640
>3080430
len = 
Sngl 

>3108024 

len = 

Term 
Init 

>3108024 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3108024 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3108025 

len = 

Init 
Intr 
Intr 
Term 

>3108025 

len = 

Term 
Intr 
Init 

>3128134 

len = 

Init 



9916 10049 



29167 28743 

30389 29273 

/124122 

1576 nex = 

43805 43635 

43963 43896 

44150 44048 

44473 44243 

44946 44906 

45067 45031 

45200 45177 

/2164 

1615 nex = 

43805 43610 

43963 43896 

44150 44048 

44473 44243 

44946 44906 

45067 45031 

45224 45177 

734936 

2501 nex = 

101875 102049 

102345 102824 

102969 103448 

103539 104375 

/113281 

1612 nex = 

104623 104358 

105265 104996 

105449 105341 

/33790 

2267 nex =

16068 16558 
Intr
Intr 
Term 

>3128134 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>3128134 

len = 

Init 
Term 

>3128134 
len = 
Sngl 

>3128135 

len = 

Init 
Term 



17365 17675 
17756 17820 
17936 18334 

737644 

2174 nex = 

19620 18952 

19787 19706 

20141 20058 

20421 20245 

20669 20507 

21125 20761 

/42970 

1721 nex = 

22577 22849 
22987 24297 

/39030 

619 nex = 

6536 7154 

/33791 

1822 nex = 

21696 22080 
22757 23517 



>3128135 

len = 

Term 
Intr 
Intr 
Init 

>3128135 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>3128135 

len = 



/25162 

1581 nex = 

47134 46795 

47507 47403 

47751 47624 

48188 48109 

/39130 

1611 nex = 

47134 46806 

47507 47403 

47751 47624 

48188 48109 

48416 48288 

/18718 

1690 nex = 



Init 48812 49068 
Intr 49281 49599
>3128135
len =



50104 50162 
/110653 



1658 



nex 



Init 
Intr 
Intr 
Intr 
Term 



53406 53626 

53710 53762 

53845 54012 

54431 54538 

54771 55063 



>3128135 
len =



/1455 



2771 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



55477 
55672 
55818 
56007 
56234 
56410 
56574 
56855 
56992 
57191 
57772 



55365 
55601 
55768 
55920 
56111 
56338 
56511 
56662 
56943 
57128 
57284 



3834 



nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



58058 
58941 
59123 
59534 
59679 
59861 
60097 
60287 
60449 
60620 
60816 
61011 
61154 
61370 
61621 



58788 
59002 
59223 
59584 
59753 
59938 
60153 
60338 
60519 
60719 
60920 
61073 
61276 
61531 
61891 



/235 
1547 nex 



Init 
Intr 
Term 



62091 62463 
62667 63009 
63093 63637 



len = 




738 nex = 



1 
Sngl
>3128136 
len = 



1 738 
/3080 
1342 nex = 



Init 
Intr 
Intr 
Term 



23530 24049 

24161 24386 

24474 24547 

24640 24871 



/2222 



1300 nex 



Init 
Intr 
Intr 
Term 



23572 24049 

24161 24386 

24474 24547 

24640 24871 



/38030 



2477 



Term 
Intr 
Intr 
Intr 
Init 



28271 27313 

28564 28362 

28798 28676 

29081 28887 

29789 29424 



>3128136 
len = 
Sngl 

>3128136 
len = 



/93374 



550 



nex =



36684 36143 
/34700 
1901 nex =



Term 
Intr 
Intr 
Intr 
Init 



39171 38646 

39414 39263 

39628 39503 

39879 39736 

40546 40298 

/35310 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



1990 nex 



42802 
42973 
43171 
43366 
43569 
43782 
43925 
44136 
44318 



42663 
42892 
43064 
43269 
43471 
43645 
43874 
44023 
44220 






>3128136 /15350 

len = 3109 nex = 

Term 45887 45746 

Intr 46034 45984 

Intr 46201 46130 

Intr 46384 46316 

Intr 46520 46479 

Intr 46680 46612 

Intr 46901 46764 

Intr 47104 46994 

Intr 47275 47188 

Intr 47451 47396 

Init 48367 47549 

>3128136 /37218 

len = 1584 nex = 

Init 57574 57902 

Intr 58199 58492 

Term 58579 59157 

>3128136 /149202 

len = 1438 nex = 

Init 57626 57902 

Term 58579 59063 

>3128136 /31524 

len = 750 nex = 

Sngl 61551 62300 

>3128136 /2152 

len = 681 nex = 

Sngl 63097 63777 

>3128136 /2489 



len = 1832 nex = 



Init 


8722 


8908 


Intr 


9213 


9287 


Intr 


9396 


9443 


Intr 


9532 


9688 


Intr 


9777 


9849 


Intr 


10088 


10185 


Term 


10266 


10553 



>3128137 /1311 

len = 670 nex = 



Sngl 10381 10111 






>3128137 

len = 

Sngl 

>3128137 

len = 

Sngl 

>3128137 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3128137 

len = 

Term 
Intr 
Intr 
Init 



Init 
Intr 
Intr 
Intr 
Term 



/2518 
509 nex = 
13914 13406 
/2561 
639 nex =
17617 16979 
735979 
1795 nex = 



29554 
30223 
30392 
30505 
30713 
30962 



29782 
30304 
30424 
30626 
30874 
31348 



/25262 

2145 nex = 

35368 34765 

35549 35466 

35707 35623 

36012 35792 

728455 

1527 nex = 



37172 
37545 
37818 
38129 
38466 



37397 
37633 
37947 
38158 
38698 



>3128137 

len = 

Term 
Init 

>3128138 

len = 

Term 
Intr 
Intr 
Init 



/9946 

719 nex = 

39259 39103 
39475 39347 

/14654 

953 nex = 

33774 33527 

33934 33895 

34248 34172 

34479 34388 



>3128139



/6095 






len = 
Sngl 


>3128139 

len = 

Term
Intr 
Init 

>3128139 


len = 

Init 
Intr 

Intr
Term 

>3128139 

len =

Sngl 

>3128139 


len = 
>3128139 
len =

Sngl 
>3128139 


len = 
Sngl 

>3128139
len = 
Init 

Intr
Intr 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Term 



203 nex = 

34265 34183 

79938 

769 nex = 

33664 33501 
34085 33948 
34269 34183 

/114411 

835 nex = 

36332 36381 

36467 36596 

36706 36764 

36863 37166 

/100570 

408 nex = 

42429 42022 

/30782 

1108 nex = 

/14992 

1135 nex = 

42487 42041 

/117908 

101 nex = 

53867 53767 

736495 

3037 nex = 

56082 56250 

57096 57195 

57270 57367 

57458 57546 

57631 57691 

57783 57947 

58040 58096 

58194 58249 

58361 58496 

58593 59118 



>3128139 7119783






1306 nex 



Term 
Init 

>3128139 

len = 

Init 
Intr 
Term 



60197 59560 

60865 60344 

/2640 

1630 nex = 

78927 79261 

79585 79788 

80124 80547 



>3128139
len = 
Term 

Intr
Init 

>3128140 

len =

Term 
Intr 
Intr 

Intr
Init 

>3128140 

len =

Init 
Intr 
Intr 

Intr

Intr 
Intr 
Intr 
Intr 

Intr

Term 

>3128141 

len =

Sngl 
>3128141 
len = 
Sngl 






1119 nex = 

8221 7893 

8425 8288 

9011 8793 



1631 



nex 



36206 35989 

36361 36296 

36935 36804 

37208 37034 

37619 37369 

74372 



2792 

42434 
42817 
42997 
43129 
43321 
43491 
43701 
43849 
44187 
44328 



42735 
42920 
43045 
43235 
43408 
43619 
43775 
44100 
44247 
44635 



/11386 
647 nex =
27315 26669 
/218 
610 nex = 
28622 28020 



>3128141 /41397






len = 1330 nex = 



Sngl 38776 37454 


>3128141 /155962 



len = 618 nex = 



Sngl 5415 4798



>3128141 /31445 



Init 58824 59151 
Term 60191 60698 



>3128142 12 


len = 1810 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



18523 
18697 
18917 
19057 
19247 
19419 
19665 
20185 



18377 
18620 
18816 
19001 
19152 
19348 
19525 
19759 



>3128142 /20783 



len = 2186 nex = 


Init 43584 43663 

Intr 43850 44097 

Intr 44207 44257 

Intr 44381 44536 

Intr 44649 44805

Intr 44887 45029 

Intr 45105 45140 

Term 45266 45769 



>3128142
len = 
Term 

Init
>3128142 
len = 


Term 
Intr 
Init 



/116956 

1195 nex = 

49243 48803 
49997 49329 

/10051 

1117 nex = 

49243 48881 
49488 49329 
49997 49638 



>3128142 /40335










398 


nex = 


1 






Sngl 


50607 


50210 


-


0 


5 














>3128142 


/31032 








len = 


3178 


nex = 


9 




10 


Init 


55780 


56016 


+ 


0 




Intr 


56983 


57164 




0 




Intr 


57277 


57357 


+ 


0 




Intr 


57442 


57501 








Intr 


57586 


57663 


+ 


0 


15 


Intr 


57752 


57815 




0 




Intr 


57898 


58169 




0 




Intr 


58258 


58485 


+ 


0 




Term 


58560 


58957 


+ 


0 


20 


>3128142 


/16740 








len = 


1431 


nex = 


2 






Init 


59192 


59602 


+ 


0 


25 


Term 


60287 


60622 


+ 


0 




>3128142 


/30437 








len = 


1315 


nex = 


2 




3 0 














Init 


81339 


81632 


+ 


0 




Term 


82066 


82653 


+ 


0 




>3128142 


/8294 






35 














len =


1150 


nex = 


2 






Term 


84674 


84011 


-


0 




Init 


85146 


84757 


-


0 


40 














>3128143 


/39401 








len = 


3813 


nex = 


11 




45 


Init 


16973 


17402 


+ 


0 




Intr 


17496 


17719 


+ 


0 




Intr 


17842 


18089 


+ 


0 




Intr 


18179 


18398 




0 




Intr 


18712 


18839 


+ 


0 


50 


Intr 


19099 


19355 


+ 


0 




Intr 


19458 


19565 


+ 


0 




Intr 


19668 


19874 




0 




Intr 


19978 


20088 


+ 


0 




Intr 


20176 


20349 


+ 


0 


55 


Term 


20425 


20785 


+ 


0 




>3128143 


/24169 








len = 


1371 


nex = 


2 








Term 21558 21040 

Init 22410 21729 

>3128143 722388 


len = 1134 nex = 

Init 23440 23527 

Intr 23989 24077 

Term 24347 24573

>3128143 /97304 

len = 1618 nex = 


Init 34719 34756 

Intr 35037 35222 

Intr 35369 35408 

Term 35792 36036 


>3128143 /17995 

len = 2075 nex = 

Init 38456 38783

Intr 38871 39113 

Term 39198 40530 

>3128143 /5688 


len = 518 nex = 

Sngl 43484 44001 

>3128143 /41898

len = 416 nex = 

Sngl 43585 44000 


>3128143 /11837 

len = 2078 nex = 

Term 44515 44311

Intr 44724 44603 

Intr 44933 44823 

Intr 45130 45020 

Intr 45250 45213 

Intr 45393 45323

Intr 46203 46049 

Init 46388 46280 

>3128143 76495 


len = 2273 nex = 

Term 44515 44170 

Intr 44724 44603 

Intr 44933 44823






Intr 
Intr 
Intr 
Intr 
Init

>3128143 

len = 


Term 
Init 

>3128143 


len = 

Term 
Intr 

Intr
Init 

>3128143 

len =

Sngl 
>3128143 


len = 

Init 
Intr 

Intr

Intr 
Term 

>3128143 


len = 

Init 
Intr 

Term
>3128166 
len = 


Init 
Intr 
Intr 
Intr 

Intr
Intr 
Term 

>3128166 




1110 



45130 45020 - 0 

45250 45213 - 0 

45393 45323 - 0 

46203 46049 - 0 

46442 46280 - 0 

/43069 

1223 nex = 2 

47252 46474 - 0 

47696 47350 - 0 

/31005 

1417 nex = 4 

48228 47939 - 0 

48546 48312 - 0 

48718 48648 - 0 

49355 49116 - 0 

/4502 

1283 nex = 1 

54087 54642 + 0 
/3836 

1672 nex = 5 

57472 57668 + 0 

57754 57814 + 0 

58106 58210 + 0 

58389 58672 + 0 

58870 59134 + 0 

/5915 

563 nex = 3

78978 79080 + 0 

79169 79306 + 0 

79392 79540 + 0 

734876 

2304 nex = 7 

104985 105384 + 0 

105472 105604 + 0 

105754 105891 + 0 

105992 106125 + 0 

106247 106372 + 0 

106603 106706 + 0 

107114 107288 + 0 



/8402 






Init 107501 107707 

Intr 107864 107980 

Intr 108265 108299 

Intr 108441 108541 

Term 108639 108914 

28166 /9014 

len = 2503 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3128166 

len = 

Term 
Init 



len = 

Term 
Intr 

Intr

Intr 
Intr 
Intr 
Intr 

Intr

Intr 
Init 

>3128166 



109037 
109367 
109771 
110290 
110541 
110780 
110930 
111101 






109232 
109543 
109860 
110433 
110708 
110849 
111011 
111260 



23650 23349 
23856 23745 



/18124 
2278 nex = 



23650 
23856 
24003 
24158 
24328 
24527 
24742 
24972 
25156 
25591 



Term 
Init 



>3128166 
len = 
Sngl
>3128166 
len = 




23314 
23745 
23937 
24112 
24263 
24415 
24623 
24830 
25042 
25463 



705 nex = 

31097 30635 
31334 31265 

/26701 

599 nex =

34617 35215 

/16528 

4456 nex = 






Init 


42061 


42128 


+ 


0 


Intr


42230 


42292 


+ 


0 


Intr 


42439 


42480 


+ 


0 


Intr 


42662 


42727 


+ 


0 


Intr 


42820 


42885 


+ 


0 


Intr 


43008 


43121 


+ 


0 


Intr 


43230 


43293 


+ 


0 


Intr 


43471 


43553 


+ 


0 


Intr 


43682 


43731 


+ 


0 


Intr 


43811 


43888 


+ 


0 


Intr 


44033 


44161 


+ 


0 


Intr 


44296 


44338 


+ 


0 


Intr 


44430 


44516 


+ 


0 


Intr 


44684 


44785 


+ 


0 


Intr 


45134 


45241 


+ 


0 


Intr 


45333 


45383 


+ 


0 


Intr 


45531 


45604 


+ 


0 


Intr 


45682 


45733 


+ 


0 


Term 


45831 


46160 


+ 


0 



Sngl



474 nex = 
47811 47358 
723768 
970 nex = 
/21043 





len = 


2437 


nex = 


9 


35 


Init 


55878 


56041 


+ 




Intr 


56251 


56433 


+ 




Intr 


56538 


56602 


+ 




Intr 


56676 


56775 


+ 




Intr 


56882 


56920 


+ 


40 


Intr 


57009 


57191 


+ 




Intr 


57460 


57638 


+ 




Intr 


57783 


57902 


+ 




Term


57996 


58314 


+ 


45 


>3128166 


/39057 






len = 


1992 


nex = 


7 




Init 


6113 


6363 


+ 


50 


Intr 


6454 


6540 


+ 




Intr 


6632 


6895 


+ 




Intr 


6985 


7149 


+ 




Intr 


7228 


7281 


+ 




Intr 


7374 


7520 


+ 


55 


Term 


7616 


8104 


+ 




>3132469 


/14874 






len = 


2445 


nex = 


6 






Term 
Intr 
Intr 
Intr 
Intr 
Init 

>3133272 

len = 

Sngl 

>3133272 

len = 

Sngl 

>3135250 

len = 

Sngl 

>3135250 

len = 

Init 
Term 

>3150395 
len = 
Sngl 

>3150395 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>3150395 

len = 

Term 
Intr 
Intr 
Intr 
Init 



4589 4297 

4838 4692 

5021 4916 

5351 5155 

5552 5460 

6741 5705 

/15372 

596 nex =

15821 16416 

/105334 

190 nex = 

47085 47265 

/99763 

598 nex = 

34775 34178 

737787 

1140 nex = 

40584 40900 
41358 41723 

73754 

708 nex = 

12833 13540 

737527 

1938 nex =



19470 
19706 
20319 
20522 
20953 



19016 
19545 
19790 
20409 
20632 



734995 

2070 nex = 

21605 21212 

21821 21733 

22028 21935 

22801 22629 

23281 23141 



>3150395 




7106011 






Term 


21605 


21183 


0 


Intr 


21821 


21733 


0 


Intr 


22028 


21935 


0 


Intr 


22801 


22629 


0 


Init 


23286 


23141 


0 



>3150395 /22034 

len = 1351 nex = 

Term 23931 23690 

Intr 24424 24224 

Init 25040 24615 







1297 


nex = 


3 


20 












Init 


75585 


75743 


+ 






75875 


75961 


+ 




Term 


76230 


76442 


+ 


25 


>3150396 


/299 






len = 


2574 


nex = 


10 




Init 


20107 


20448 


+ 


30 


Intr 


20697 


20782 


+ 




Intr 


20860 


20926 


+ 




Intr 


21087 


21229 


+ 




Intr 


21397 


21445 


+ 




Intr 


21591 


21728 


+ 


35 


Intr 


21813 


21893 


+ 




Intr 


21979 


22155 


+ 




Intr 


22242 


22328 






Term 


22416 


22680 


+ 


40 


>3150396 


737777 






len = 


1499 


nex = 


7 




Init 


21157 


21229 


+ 


45 


Intr 


21397 


21445 






Intr 


21591 


21728 


+ 




Intr 


21813 


21893 






Intr 


21979 


22155 


+ 




Intr 


22242 


22328 


+ 


50 


Term 


22416 


22655 


+ 




>3150396 


/18023 






len = 


550 


nex = 


1 


55 












Sngl 


29680 


30224 


+ 



len =



1095 nex = 



2 






Term 
Init 

>3150396 

len = 

Init 
Term 

>3150396 

len = 

Term 
Intr
Intr 
Init 

>3150396 
len = 
Sngl 

>3152602 

len = 

Init 
Term 

>3152602 

len = 

Sngl 

>3152602 

len = 

Sngl 

>3152602 

len = 

Sngl 

>3152602 

len = 

Sngl 

>3152602 

len = 



36375 35916 
37010 36739 

74576 

1543 nex =

65489 66226 
66584 67031 

/29581 

1572 nex = 

77303 77124 

78071 77946 

78321 78170 

78695 78445 

736577 

2177 nex = 

80119 79013 

733637 

1336 nex = 

12908 13530 
13557 14243 

717402 

1296 nex = 

16191 17124 

72328 

651 nex = 

68201 67551 

736311 

1291 nex = 

7655 8945 

720125 

690 nex =

76571 77260 

7104289 

760 nex = 






Sngl 

>3169169 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



4293 nex



10438 
10800 
11012 
11168 
11602 
12378 
12552 
12727 
14190 
14407 



10717 
10945 
11088 
11290 
12013 
12464 
12620 
12798 
14303 
14730 



4175 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



10565 
10800 
11012 
11168 
11602 
11912 
12378 
12552 
12727 
13609 
14190 
14559 



10717 
10945 
11088 
11290 
11704 
12013 
12464 
12620 
12798 
13730 
14303 
14739 



3649 nex =



Init
Intr 
Intr 
Intr 
Intr 

Intr

Intr 
Intr 
Intr 
Intr 

Intr

Intr 
Intr 
Term 

>3169169
len = 



23773 
24056 
24279 
24441 
24831 
25023 
25156 
25439 
25617 
25899 
26085 
26875 
27084 
27222 



23969 
24201 
24355 
24560 
24933 
25060 
25257 
25525 
25700 
25970 
26203 
26988 
27133 
27421 



/38084 
2296 nex =



Init 58707 58874
Intr 59120 59249






Intr 


59331 


59525 


+ 


0 


Intr 


59614 


59771 


+ 


0 


Intr 


59925 


60015 


+ 


0 


Intr 


60148 


60213 


+ 


0 


Intr 


60303 


60385 


+ 


0 


Intr 


60504 


60567 


+ 


0 


Intr 


60639 


60707 


+ 


0 


Term 


60845 


61002 


+ 


0 



>3169169 

len = 

Term 
Intr 
Init 

>3169169 

len = 



Term 
Intr 
Init 

>3169169 

len = 

Sngl 

>3169169 

len = 

Sngl 

>3169169 

len = 

Sngl 

>3169169 

len = 

Sngl 

>3169169 

len = 

Sngl 

>3169169 

len = 

Sngl 



/24000 

850 nex = 

61337 61068 
61652 61612 
61908 61760 

/14468 

953 nex = 

61337 61019 
61652 61612 
61971 61760 

727437 

407 nex = 

73602 74008 
/206407 

478 nex = 

73603 74080 
79937 

478 nex = 

73604 74081 
728423 

507 nex = 
73604 74110 
726935 
530 nex = 
73604 74133 
711891 
532 nex = 
73604 74135 






>3169169 

len = 

Sngl 

>3172156 

len = 

Sngl 

>3172156 

len = 

Init 
Intr 
Intr 
Term 

>3172156 

len = 

Init 
Intr 
Intr 
Term 

>3172156 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3172156 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3172156 
len = 

>3172155 
len = 
Init 



/121073 

478 nex =

73656 74133 

726867 

7 90 nex = 

3280 4062 

77894 

2323 nex = 

45242 46579 

46725 46788 

47076 47119 

47209 47564 

739526 

1772 nex = 

45807 46579 

46725 46788 

47076 47119 

47209 47578 

76711 

1810 nex = 



48322 
48765 
48909 
49460 
49686 



48537 
48805 
49207 
49606 
50127 



7120133 
1781 nex 



48347 
48765 
48909 
49460 
49686 



48537 
48805 
49207 
49606 
50127 



7114146 
2516 nex = 

76399 
2508 nex = 
63063 63494 






Intr 63615 64173 
Term 64493 64821 



len = 817 nex = 

Term 74213 74001 

Intr 74366 74320 

Intr 74576 74517 

Init 74817 74669 



>3172156 

len = 

Term 
Intr 
Intr 
Init 

>3172156 

len = 



/25408 



957 



nex = 



74213 74000 

74366 74320 

74576 74517 

74956 74669 

/94230 



Term 
Init 



7070 6731 
7680 7442 



>3176693 
len = 



Init 
Term 



20568 20774 
20878 21151 



>3176693 
len = 



/36051 
1870 nex 



Init 
Intr 
Intr 
Intr 
Intr 



31001 
31211 
31406 
31618 
31848 
32037 



31132 
31309 
31522 
31713 
31960 
32294 



>3176693 
len = 



5624 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



36200 
36523 
37268 
37641 
38211 
38769 
39653 
39839 
40136 
40542 



35812 
36290 
36993 
37365 
37913 
38584 
39444 
39776 
39990 
40305 






>3176693 

len = 

Term 
Init 

>3176694 
len = 
Sngl 

>3176694 

len = 

Term 
Intr 
Init 

>3176694 
len = 
Sngl 

>3176695 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3176695 

len = 

Init 
Intr 
Term 

>3176695 
len = 
Sngl 

>3176695 
len = 



/12338 

1346 nex = 

8005 7949 
8926 8665 

/10539 

704 nex = 

11287 11990 

/206065 

1371 nex = 

19165 18447 
19610 19409 
19817 19698 

73965 

356 nex = 

29025 28693 

/39581 

3411 nex = 

8155 7742 

8402 8251 

8648 8490 

9014 8762 

9217 9101 

9356 9294 

11152 10747 

/25260 

897 nex = 

26015 26131 
26251 26367 
26507 26911 

/34123 

215 nex = 

38577 38363 

728545 

889 nex = 



Term 44351 43984 
Init 44872 44663 






>3176695 /32366 



len = 1810 nex = 

Init 45733 45892 

Intr 46308 46477 

Intr 46559 46750 

Intr 46829 47178 

Term 47280 47536 



>3176695 /18G04 



len = 119 nex = 

Sngl 75443 75561 
>3176701 737338 



2170 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



23063 
23800 
24237 
24402 
24573 
24845 



23245 
24134 
24313 
24462 
24717 
25227 



>3176701 /39441 

len = 1243 nex = 

Init 84984 85415 

Intr 85710 85873 

Term 86105 86226 

>3176701 /38836 

len = 490 nex = 

Init 85708 85873 

Term 86105 86195 

>3184270 726825 

len = 1709 nex = 

Term 46892 46301 

Init 48009 47436 

>3184270 /13775 



len = 2935 nex = 



Init 


52068 


52252 


Intr 


52654 


52722 


Intr 


53313 


53438 


Intr 


53646 


53728 


Intr 


53832 


53907 


Intr 


53982 


54158 






Intr 54241 54369 
Intr 54444 54582 
Term 54778 55002 



Sngl 77713 77315 



2294 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



80046 
80300 
80466 
80725 
80900 
81099 
81270 
81422 
81567 
81699 
81859 



80143 
80371 
80521 
80804 
80982 
81176 
81338 
81490 
81613 
81789 
82119 



1969 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



80300 
80466 
80725 
80900 
81099 
81270 
81422 
81567 
81699 
81859 



80371 
80521 
80804 
80982 
81176 
81338 
81490 
81613 
81789 
82035 



1915 



Term 
Intr 
Intr 
Init 



16811 16369 

17513 17355 

17971 17940 

18283 18065 



/20206 



len 



2050 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



35088 
35751 
36023 
36287 
36457 
36865 



35203 
35937 
36179 
36414 
36643 
37128 






>3193282 

len = 

Init 
Term 

>3193282 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



36528 36643 
36865 37141 



2692 

39121 
40307 
40438 
40596 
40783 
40918 
41132 
41298 
41506 
41770 



nex = 

39088 
40078 
40381 
40547 
40726 
40872 
40997 
41224 
41441 
41692 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3193282 

len = 

Init 
Term 

>3193282 

len = 

Sngl 

>3193282 

len = 

Sngl 

>3193282 

len = 



1738 

51343 
51738 
51986 
52242 
52446 
52655 
52894 



nex = 

51536 
51895 
52106 
52366 
52559 
52822 
53080 



/23194 

717 nex = 

63968 64230 
64331 64684 

/17524 

707 nex = 

68554 67848 

/19543 

869 nex = 

68626 67758 

/15281 

1873 nex = 



Term 73831 73399 
Intr 74170 73932 






Init 
>3193305 
len = 



75271 74778 
/40560 
2320 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



10008 
10234 
10745 
10952 
11528 
11941 



9622 
10083 
10346 
10836 
11466 
11625 



>3193305 
len = 
Sngl 
>3193305 



/29200 
986 nex = 
30156 29171 
/18253 



len 



670 



nex 



Sngl 
>3193311 



2718 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



27978 
28407 
28610 
28784 
29273 
29489 
29673 
29932 
30249 
30371 



28299 
28512 
28670 
29064 
29412 
29563 
29763 
30135 
30293 
30695 



len = 

Init 
Intr 
Intr 
Term 

>3193311 



Init 
Intr 
Intr 
Term 



1211 nex = 

30962 31055 

31136 31208 

31291 31429 

31513 31699 

/12983 

2068 nex = 

57272 57337 

57712 58588 

58663 58797 

58889 59339 



>3193311 

60 



/32470 






len = 

Sngl 

>3193311 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



610 nex = 
63745 63139 
733482 



1692 

64707 
64872 
65002 
65233 
65434 
65755 
66044 



nex = 

64353 
64795 
64961 
65072 
65319 
65682 
65896 



>3193311 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>3193311 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>3193311 

len = 

Init 
Term 

>3201608 

len = 

Init 
Intr 
Term 

>3201608 

len = 

Init 
Term 



2350 nex = 

66929 66852 

67385 67310 

67569 67521 

68289 68130 

68762 68559 



/13263 



2770 



nex 



66929 66852 

67385 67310 

67569 67521 

68289 68130 

68729 68559 

69246 68838 

/21877 

1450 nex = 

8494 8695 
8783 9939 

/30206 

916 nex = 

13538 13733 
13827 13875 
14061 14453 

736378 

337 nex = 

13539 13733 
13827 13875 



>3201608 



738967 






1123 nex = 



Init 
Intr 
Term 

>3201608 

len = 

Sngl 

>3201608 

len = 

Sngl 

>3201608 

len = 

Init 
Intr 
Intr 
Term 

>3201608 

len = 

Init 
Intr 
Term 

>3201608 
len = 
Sngl 

>3201608 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



15910 16091 
16178 16226 
16396 17032 

/142926 

44 6 nex = 

15916 16359 

733579 

430 nex = 

16612 17032 

/7866 

2272 nex = 

22124 22360 

22689 22844 

23414 23562 

23794 24395 

/125631 

956 nex = 

36898 37143 
37222 37407 
37534 37853 

/118068 

680 nex = 

52418 52003 

/34360 

2916 nex = 

53417 53661 

53871 53956 

54282 54488 

54612 54717 

54802 54906 

54980 55176 

55376 55535 

55661 55756 

55848 55942 

56072 56332 



/18932 



60 len = 



3 685 nex = 



12 






Term 


56943 


56652 


_ 


0 


Intr 


57107 


57006 


_ 


0 


Intr 


57319 


57236 


_ 


0 


Intr 


57540 


57445 


_ 


0 


Intr 


57666 


57616 




0 


Intr 


57901 


57761 




0 


Intr 


58146 


58090 




0 


Intr 


58297 


58223 




0 


Intr 


58658 


58386 




0 


Intr 


58800 


58741 




0 


Intr 


59106 


59014 




0 


Init 


60336 


60218 




0 



>3201608 



3770 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



57107 
57319 
57540 
57666 
57901 
58146 
58297 
58658 
58800 
59106 
60381 



57006 
57236 
57445 
57616 
57761 
58090 
58223 
58386 
58741 
59014 
60218 



>3201608 
len = 



/35095 
2050 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



83717 
83885 
84083 
84260 
84508 
84761 
84937 
85083 
85526 



83480 
83817 
83991 
84169 
84428 
84666 
84841 
85034 
85162 



>3212102 
len = 



1724 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



20146 
20513 
20801 
20991 
21233 
21394 
21563 
21735 



20243 
20640 
20899 
21149 
21319 
21485 
21637 
21869 



60 len = 



930 nex = 



2 






Term 
Init 

>3212102 

len = 

Term 
Init 

>3212102 

len = 

Init 
Term 

>3212102 

len = 

Sngl 

>3212102 

len = 

Sngl 

>3212846 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>3212846 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3212846 

len = 



23160 22726 
23655 23565 

/7088 

1376 nex = 

23160 22971 
24346 23565 

/42528 

1189 nex = 

54373 54915 
55009 55561 

734748 

631 nex = 

71980 71350 

725849 

437 nex = 

71980 71544 

730161 

1396 nex = 

100365 100258 

100551 100455 

100842 100748 

101023 100927 

101352 101107 

101653 101420 

724370 

2551 nex = 



14963 
15363 
15522 
15716 
15974 
16145 
16807 
17292 



14742 
15285 
15476 
15617 
15874 
16059 
16677 
17119 



736599 
598 nex 



Term 16807 16695 
Init 17292 17119 






>3212846 
len = 



/106170 
1639 nex 



Term 


19624 


19275 


0 


Intr 


19851 


19773 


0 


Intr 


20034 


19988 


0 


Intr 


20229 


20130 


0 


Intr 


20408 


20308 


0 


Init 


20607 


20521 


0 



>3212846 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>3212846 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2304 nex = 

28397 28158 

28537 28479 

28784 28635 

28981 28865 

29343 29149 

30461 30043 



/482 



1849 



nex 



36128 35534 

36364 36203 

36609 36541 

36747 35700 

36942 36840 

37120 37041 

37271 37200 

/36111 





len = 


1817 


nex = 


8 


40 


Term 


52320 


52252 






Intr 


52529 


52442 






Intr 


52740 


52631 






Intr 


52934 


52833 






Intr 


53111 


53037 




45 


Intr 


53340 


53207 






Intr 


53778 


53541 






Init 


54068 


53940 






>3212846 


/10293 




50 












len = 


2263 




8 




Term 


52320 


52036 






Intr 


52529 


52442 




55 


Intr 


52740 


52631 






Intr 


52934 


52833 






Intr 


53111 


53037 






Intr 


53340 


53207 






Intr 


53778 


53541 




60 


Init 


54153 


53940 








1882 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



54528 
54809 
55061 
55233 
55472 
55791 
55936 
56146 



54721 
54924 
55156 
55328 
55550 
55837 
55995 
56409 



>3212846 



Sngl 58250 58411 



Init 
Term 



57184 58160 
58250 58415 



>3212846 
len = 



Init 
Intr 
Term 



58759 58825 
59065 59223 
60088 60794 





len = 


2058 


nex = 


3 


40 


Init 


58759 


58825 


+ 




Intr 


59065 


59223 


+ 




Term 


60088 


60804 






>3212846 


729536 




45 












len = 


3420 


nex = 


13 




Init 


71959 


72157 


+ 




Intr 


72507 


72569 


+ 


50 


Intr 


72658 


72741 






Intr 


72831 


72938 






Intr 


73127 


73299 


+ 




Intr 


73393 


73504 


+ 




Intr 


73614 


73694 


+ 


55 


Intr 


73795 


73857 


+ 




Intr 


73975 


74076 


+ 




Intr 


74249 


74373 


+ 




Intr 


74619 


74829 


+ 




Intr 


74912 


74986 


+ 


60 


Term 


75078 


75378 


+ 






>3212846 /4611 

len = 2878 nex = 

Init 72071 72157 

Intr 72507 72569 

Intr 72658 72741 

Intr 72831 72938 

Intr 73127 73299 

Intr 73393 73504 

Intr 73614 73694 

Intr 73795 73857 

Intr 73975 74076 

Intr 74249 74373 

Term 74619 74829 

>3212846 /36813 

len = 1667 nex = 

Term 79812 79309 

Intr 79978 79899 

Intr 80211 80154 

Intr 80392 80295 

Init 80975 80496 

>3212846 /17409 

len = 1733 nex = 

Term 79812 79361 

Intr 79978 79899 

Intr 80211 80154 

Intr 80392 80295 

Init 81093 80496 

>3212846 /30978 

len = 1706 nex = 

Term 79812 79388 

Intr 80211 79899 

Intr 80392 80295 

Init 81093 80496 

>3212846 /6950 

len = 1390 nex = 

Sngl 83877 85266 

>3212846 /17908 

len = 2271 nex = 

Init 93341 93451 

Intr 93759 93932 

Intr 94012 94112 

Intr 94210 94321 






Intr 
Intr 
Intr 
Term 

>3228389 

len = 

Init 
Term 

>3228389 

len = 

Init 
Term 

>3228389 

len = 



Term 
Init 



94487 94567 

94665 94842 

94936 95270 

95369 95611 

/117479 

1270 nex = 

25417 26084 
26445 26682 

11221 

1347 nex = 

25417 26084 
26445 26763 

/15453 

1292 nex = 

42250 42036 
42669 42513 



>3228389 

len = 

Term 
Init 

>3228389 

len = 

Init 
Intr 
Intr 
Term 

>3228389 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>3228389 

len = 

Sngl 



/2617 

1296 nex = 

42250 42033 
42669 42513 

742666 

1771 nex = 

45914 46685 

46846 46967 

47100 47351 

47455 47684 

739286 

1930 nex = 

55070 54825 

55329 55246 

55816 55703 

56104 56001 

56745 56456 

718771 

320 nex = 

57320 57001 



>3228389 7749 







len = 1712 nex = 

Init 59020 59117 

Intr 59214 59250 

Intr 59369 59409 

Intr 59889 60119 

Intr 60204 60306 

Intr 60397 60464 

Term 60563 60731 

>3228389 /34212 

len = 1113 nex = 

Term 71736 71610 

Init 72722 72553 

>3228389 722723 

len = 1056 nex = 

Sngl 72763 72553 

>3228389 73279 

len = 1656 nex = 

Init 76223 76387 

Intr 76478 76588 

Intr 76756 76848 

Intr 77250 77399 

Intr 77484 77542 

Term 77536 77878 

>3228389 735408 

len = 1760 nex = 

Init 88411 88892 

Term 89156 90170 

>3236234 742701 

len = 2562 nex = 

Init 14714 14911 

Intr 15858 15964 

Intr 16057 16189 

Intr 16311 16602 

Term 16755 17275 

>3236234 7116968 

len = 826 nex = 

Sngl 20794 21619 

>3236234 718641 

len = 814 nex = 






Sngl 21539 20726 

>3236234 /34743 

len = 1033 nex = 

Term 30515 30312 

Intr 30858 30659 

Init 31344 31168 

>3236234 /23114 

len = 1095 nex = 

Term 30515 30314 

Intr 30858 30659 

Init 31275 31168 

>3236234 /19581 

len = 2170 nex = 

Init 35817 35886 

Intr 36023 36416 

Intr 36938 37130 

Term 37242 37510 

>3236234 /38101 

len = 1484 nex = 

Init 35822 35886 

Intr 36023 36416 

Intr 36478 37130 

Term 37242 37305 

>3236234 /25785 

len = 1487 nex = 

Init 35822 35886 

Intr 36023 36416 

Intr 36938 37130 

Term 37242 37308 

>3236234 /13797 

len = 1695 nex = 

Term 51666 51431 

Intr 51905 51782 

Intr 52049 51988 

Intr 52228 52131 

Intr 52342 52306 

Intr 52698 52427 

Init 53125 52779 



>3236234 



728529 






len = 529 nex = 

Sngl 87056 87584 

>3236479 /10950 

len = 1705 nex = 

Init 28178 28532 

Term 29443 29882 

>3236479 /30441 

len = 285 nex = 

Sngl 28227 28511 

>3236479 /143197 

len = 439 nex = 

Sngl 38868 39306 

>3236479 /32070 

len = 2470 nex = 

Term 89296 89181 

Intr 89764 89651 

Intr 89981 89841 

Init 91120 90101 

>3241916 /143222 

len = 1330 nex = 

Term 17392 17118 

Intr 17601 17497 

Intr 17779 17689 

Intr 17948 17886 

Intr 18263 18214 

Init 18446 18359 

>3241916 /7891 

len = 1510 nex = 

Init 19560 19940 

Term 20202 21066 

>3241916 /105027 

len = 167 nex = 

Sngl 42357 42523 

>3241916 /2036 

len = 1660 nex = 






Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



42888 
43123 
43297 
43511 
43721 
43846 
43979 
44199 
44330 



42671 
42976 
43247 
43397 
43603 
43802 
43940 
44093 
44290 



>3241916 
len = 



/12974 
1882 nex = 



Term 
Intr 
Intr 
Intr 
Init 



52467 52276 

52920 52744 

53121 53006 

53292 53235 

53637 53395 



>3241917 
len = 



/148899 
1424 nex 



Term 
Init 



>3241917 
len = 



12645 12184 
13607 13335 



/3896 
1486 nex 



Term 
Init 

>3241917 
len = 
Sngl 

>3241917 
len = 



12645 12218 
13703 13335 

/152864 

319 nex = 

13703 13385 

/119748 

794 nex = 



Term 
Init 

>3241917 

len = 

Init 
Intr 
Intr 
Term 



28884 28539 
29332 28954 



1487 

37712 
38331 
38659 
38871 



nex = 

38242 
38575 
38777 
39198 



len = 

60 



1491 nex = 



5 






Init 
Intr 
Intr 
Intr 
Term 



Init 
Intr 
Intr 
Intr 
Term 

>3241917 
len = 
Sngl 

>3241917 

len = 

Init 
Intr 
Intr 
Term 

>3241920 

len = 

Sngl 

>3241921 

len = 

Sngl 

>3241921 

len = 

Sngl 

>3241922 

len = 

Init 
Intr 
Intr 
Term 



44373 44517 

44592 44617 

44716 44805 

44885 44973 

45057 45296 

737636 

939 nex = 



44355 
44592 
44716 
44885 
45057 



44517 
44617 
44805 
44973 
45293 



734484 

490 nex = 

51169 51654 

730313 

1235 nex = 

57202 57315 

57401 57469 

57768 57870 

58209 58436 

719080 

1218 nex = 

45667 46884 

742897 

475 nex = 

26675 26201 

733053 

555 nex = 

35941 36495 

729124 

1276 nex = 

18356 18395 

18580 18665 

18750 18873 

19391 19631 



>3241922 

60 



734761 






len = 

Sngl 

>3241922 

len = 

Init 
Intr 
Intr 
Term 

>3241922 

len = 

Term 
Intr 
Intr 
Init 

>3241922 

len = 

Sngl 

>3241922 

len = 

Sngl 

>3241923 

len = 

Sngl 

>3241923 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3241923 

len = 

Sngl 



420 nex = 

288 707 

/13557 

1016 nex = 

47961 48011 

48229 48408 

48587 48661 

48747 48972 

729542 

1845 nex = 

49432 49038 

49649 49523 

50374 50138 

50882 50532 

/100927 

391 nex = 

58311 58701 

734244 

166 nex = 

84414 84579 

740576 

765 nex = 

16439 17203 

78215 

1690 nex = 



21000 
21226 
21370 
21576 
21730 
22067 
22235 
22526 



20843 
21110 
21308 
21479 
21665 
21967 
22163 
22444 



711699 
600 nex = 
24791 24991 



>3241923 



77822 






595 nex = 



Term 
Intr 
Init 

>3241923 

len = 

Term 
Init 

>3241923 

len = 

Term 
Init 

>3241923 

len = 

Init 
Intr 
Term 

>3241923 

len = 

Init 
Term 

>3241923 

len = 

Init 
Term 

>3241923 

len = 

Sngl 

>3241923 

len = 

Sngl 

>3241923 

len = 

Init 



26064 25987 
26320 26160 
26559 26419 

736757 

1390 nex = 

26064 25685 
26320 26160 

/20036 

734 nex = 

29311 28952 
29685 29399 

/20973 

1537 nex = 

33738 33851 
34586 34851 
34945 35274 

/11691 

1472 nex = 

36556 36821 
36913 37365 

79208 

510 nex = 

36555 36821 
36913 37064 

720822 

715 nex = 

40595 41309 

7114182 

399 nex = 

40736 41127 

727649 

1630 nex = 

53569 53872 






Intr 
Intr 
Intr 
Term 

>3241923 

len = 



54069 54212 

54361 54429 

54523 54701 

54800 55196 

/26907 



Init 


53602 


53872 


+ 


0 


Intr 


54069 


54212 


+ 


0 


Intr 


54361 


54429 


+ 


0 


Intr 


54523 


54701 


+ 


0 


Term 


54800 


55218 


+ 


0 



>3241923 
len = 
Sngl 

>3241923 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>3241924 
len = 

>3241924 
len = 
Sngl 

>3241924 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



/31173 



nex = 



440 

60193 59754 
74349 



1995 

8413 
8649 
8936 
9218 
9456 
9940 



nex = 

7946 
8527 
8739 
9120 
9304 
9737 



/4012 
407 nex = 

/2034 
574 nex = 
2980 3553 

/41458 
2372 nex = 



32689 
32948 
33091 
33270 
33426 
33636 
33778 
33948 
34113 
34286 



32396 
32773 
33036 
33226 
33360 
33525 
33734 
33876 
34039 
34212 



len = 



1734 nex = 



5 






Term 


36042 


35649 


Intr 


36432 


36137 


Intr 


36674 


36516 


Intr 


37042 


36742 


Init 


37382 


37126 



>3241924 /31303 

len = 1072 nex = 

Init 45856 45947 

Intr 46181 46247 

Term 46336 46715 

>3241924 /14377 

len = 768 nex = 

Term 51633 51312 

Intr 51804 51711 

Init 52079 51889 

>3241924 738528 

len = 1902 nex = 



Term 


51633 


51293 


Intr 


51804 


51711 


Intr 


52142 


51889 


Intr 


52355 


52232 


Intr 


52541 


52448 


Init 


53194 


53036 



>3241924 77358 

len = 490 nex = 

Sngl 58044 58525 

>3241924 7106913 

len = 1215 nex = 

Init 77459 77620 

Term 78380 78673 

>3241925 710394 

len = 2076 nex = 

Term 49380 48998 

Intr 49603 49472 

Intr 49732 49678 

Intr 50653 50482 

Intr 50890 50743 

Init 51073 50979 

>3241926 725388 






Init 12191 12401 

Term 12491 12805 

>3241926 /41509 

len = 1870 nex = 

Init 13521 13588 

Intr 13687 14214 

Intr 14299 14486 

Term 14580 14889 

>3241926 /13875 

len = 1717 nex = 



Init 
Intr 
Intr 
Intr 
Term 



13173 13425 

13521 13588 

13687 14214 

14299 14486 

14580 14889 



>3241926 
len = 



/18612 



2833 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3241926 
len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3241926 

len = 

Term 
Intr 
Intr 
Intr 
Init 



15300 14853 

15460 15363 

15616 15557 

15814 15695 

16062 15916 

16256 16155 

17421 16839 

/8268 

1345 nex = 

6075 6204 

6527 6560 

6653 6825 

6914 6965 

7061 7128 

7215 7419 

/206563 

1570 nex = 

74328 73909 

74617 74464 

74837 74697 

74994 74923 

75192 75080 



>3241927 



74309 



len = 



757 nex = 



0 






>3241927 
len = 

>3241939 

len = 

Init 
Intr 
Term 

>3241939 

len = 

Init 
Term 

>3241939 

len = 

Term 
Intr 
Init 

>3242700 

len = 
Sngl 

>3242700 
len = 
Term 
Init 

>3242700 
len = 

Term 
Intr 
Init 

>3242970 
len = 
Init 
Intr 
Intr 
Intr 
Term 



732995 
2052 nex = 

727423 

1306 nex = 

2469 2639 
2957 3312 
3400 3774 

731388 

970 nex = 

2957 3312 
3400 3649 

741130 

1254 nex = 

28798 28404 
29363 29256 
29657 29438 

7120446 

358 nex = 

37693 38050 

725463 

1390 nex = 

93971 93636 
95019 94055 

726006 

1391 nex = 

93971 93636 
94189 94055 
95026 94632 

734126 

1718 nex = 

61864 62009 

62392 62498 

62588 62788 

63185 63249 

63334 63581 



>3242970 



799825 






len = 
Init 
Intr 
Intr 
Intr 
Term 

>3243214 
len = 
Term 
Intr 
Init 

>3249094 

len = 

Sngl 

>3249094 

len = 
Sngl 

>3249094 
len = 
Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3249094 

len = 

Sngl 

>3249094 

len = 

Term 
Intr 
Init 



1815 nex = 

61865 62009 

62392 62498 

62588 62788 

63185 63249 

63334 63679 

724559 

1330 nex = 

22739 22559 

23241 23208 

23516 23386 

/9228 

951 nex = 

12849 11899 

/37124 

1184 nex = 

17368 16185 

/27210 

3696 nex = 

25520 25687 

26293 26400 

26495 26687 

26785 27043 

27217 27321 

27408 27494 

27577 27634 

28142 28209 

28295 28390 

28484 28531 

28693 28722 

28807 28923 

29021 29215 

/10388 

587 nex = 

30050 30636 

/1854 

792 nex = 

36628 36312 

36865 36734 

37103 36974 






>3249094 /31022 

len = 1314 nex = 

Init 51260 51655 

Intr 51724 51842 

Term 51917 52573 

>3249094 /110171 

len = 2770 nex = 

Term 53358 53234 

Intr 53539 53476 

Intr 53856 53789 

Intr 54097 53939 

Intr 54429 54377 

Intr 54692 54595 

Intr 54861 54794 

Init 55039 54958 

>3249094 77559 

len = 2785 nex = 

Init 56298 56517 

Intr 56641 56706 

Intr 56793 56867 

Intr 56947 57049 

Intr 57135 57181 

Intr 57289 57346 

Intr 57470 57519 

Intr 57663 57719 

Intr 57793 57905 

Intr 58003 58080 

Term 59023 59082 

>3250673 /3164 

len = 734 nex = 

Sngl 17027 17760 

>3250673 /17437 

len = 1613 nex = 

Init 83877 84182 

Intr 84570 84691 

Term 84930 85489 

>3250673 734882 

len = 1175 nex = 

Term 94732 94604 

Intr 95043 94819 

Init 95309 95132 






>3252804 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3252804 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3269280 

len = 

Term 
Init 

>3269280 

len = 

Term 
Init 

>3269280 

len = 

Init 
Term 

>3269280 
len = 
Sngl 

>3269280 
len = 



4015 

17729 
18030 
18354 
18716 
19017 
19896 
20112 
20703 
21077 
21338 



17324 
17844 
18243 
18462 
18946 
19802 
19995 
20628 
20991 
21278 



/14216 
1608 



72843 
73012 
73182 
73389 
73624 
73849 
74216 



nex = 

72609 
72966 
73098 
73271 
73475 
73709 
73929 



/15604 

971 nex = 

13730 13454 
14424 14146 

7927 

1073 nex = 

13730 13356 
14428 14146 

72944 

1694 nex = 

15972 16829 
17015 17665 

/13072 

670 nex = 

40249 39588 

/17584 

594 nex = 



Sngl 40252 39659 

60 



0 






>3269280 



/10313 
2310 nex 



Term 
Init 



54740 54478 
56787 55952 



>3269280 
len = 



1969 nex = 



Term 
Init 



67347 66835 
68803 68154 



>3269280 
len = 



1890 nex 



Init 
Intr 
Intr 
Intr 
Term 



77267 77439 

77520 77574 

78008 78078 

78194 78302 

78818 79156 



>3269280 
len = 



/40387 



Init 
Term 



81954 82059 
82190 82846 



>3281847 
len = 



Term 
Init 



476 nex 



10602 10258 
10733 10682 



>3281847 
len = 



2313 



nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3281847 

len = 

Init 
Intr 
Intr 
Intr 
Term 



31587 31715 

31812 31876 

32278 32340 

32526 32577 

32672 32722 

33024 33293 

/34408 

1487 nex = 

39703 39769 

40039 40096 

40195 40303 

40413 40685 

40780 41189 



>3281847 



733868 






Init 
Intr 
Term 

>3281847 

len = 

Term 
Intr 
Intr 
Init 

>3281847 

len = 

Term 
Intr 
Init 

>3281847 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>3282170 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3282170 

len = 

Term 
Init 

>3282170 
len = 
Sngl 

>3282170 
len = 



40194 40303 
40413 40685 
40780 41200 

/40692 

1740 



2811 
3511 
3838 
4124 



nex = 

2385 
3125 
3776 
3963 



/96391 

771 nex = 

79299 78987 
79557 79423 
79757 79637 

/25581 

1595 nex = 

79299 79054 

79557 79423 

79889 79637 

80152 80064 

80356 80256 

/19175 

1398 nex = 

110143 110193 

110394 110449 

110536 110588 

110832 110950 

111100 111270 

/2843 

656 nex = 

112059 111629 
112284 112232 

/16387 
1588 nex = 
43553 41966 

/39661 
2445 nex = 






Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3282170 

len = 

Sngl 

>3282170 

len = 

Sngl 

>3282170 

len = 

Sngl 

>3282170 

len = 

Sngl 

>3292807 

len = 

Term 
Init 

>3292807 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3292807 

len = 



59522 59409 

59658 59614 

59899 59756 

60368 60000 

61028 60447 

61250 61122 

61853 61594 

/30230 

870 nex = 

70042 70911 

/32125 

1584 nex = 

84969 86552 

72939 

310 nex = 

86228 86537 

725426 

458 nex = 

88205 88662 

742959 

1450 nex = 

18522 17812 
19126 18607 

713264 

2439 



20030 
20826 
21037 
21226 
21422 
21563 
21733 
21975 
22302 



nex = 

20399 
20938 
21133 
21326 
21478 
21639 
21895 
22072 
22468 



721391 
2384 nex 



Init 20085 20399 
Intr 20826 20938 
















0 




Intr 


21226 


21326 


+ 


0 




Intr 


21422 


21478 


+ 


0 




Intr 


21563 


21639 


+ 


0 






21733 


21895 








Intr 


21975 


22072 


+ 


0 






22302 


22468 








>3292807 


/95912 






10 
















347 


nex = 








Sngl 


22302 


22507 


+ 


0 


15 


>3292807 


/29391 








len = 


1588 


nex = 


4 






Term 


30143 


29734 


- 


0 


2 0 




30387 


30220 








Intr 


30755 


30582 


_ 


0 




Init 


31321 


30873 


- 


0 




>3292807 


/8228 




















len = 


625 


nex = 


2 






Term 


73592 


73242 


_ 


0 




Init 


73866 


73675 


- 


0 


30 














>3292807 


725628 








len = 


2590 


nex = 


6 




35 


Term 


73592 


73176 


- 


0 




Intr 


73782 


73675 








Intr 


74304 


74203 


_ 


0 




Intr 


74936 


74735 


- 


0 




Intr 


75046 


75018 


- 


0 


40 


Init 


75763 


75408 


- 


0 




>3293581 


/13181 










1475 


nex = 






45 














Term 


69408 


68947 








Intr 


69691 


69496 


_ 


0 




Init 


70421 


70034 


- 


0 






/1461 










1570 


nex = 








Term 


69408 


68922 




0 


55 


Intr 


69691 


69496 




0 




Init 


70489 


70034 




0 




>3293581 


739258 






60 


len = 


1590 


nex = 


3 

















1151 


Term 


69408 68907 


- 


0 


Intr 


69691 69496 




0 


Init 


70496 70034 


- 


0 


>3293582 


/97031 






len = 


39 0 nex = 


1 




Sngl 


22292 21903 


_ 


0 


>3293582 


/14258 






len = 


3176 nex = 


7 




Term 


50181 49892 




0 


Intr 


50364 50280 




0 


Intr 


51074 50997 




0 


Intr 


51309 51162 




0 


Intr 


51674 51515 


- 


0 


Intr 


52551 52516 




0 


Init 


53067 53029 


- 


0 


>3293583 


/111377 






len = 


1211 nex = 


1 




Sngl 


46281 46069 


- 


0 


>3293583 


/1463 






len = 


1510 nex = 


4 




Term 


51608 51036 




0 


Intr 


51864 51769 


- 


0 


Intr 


52214 51938 




0 


Init 


52536 52350 


: 


0 


>3293583 


/112432 






len = 


1849 nex = 


3 




Term 


55125 55072 


- 


0 


Intr 


55543 55514 




0 


Init 


55756 55620 


- 


0 


>3297806 


724629 








971 


2 




Term 


18135 18036 




0 


Init 


18729 18586 


_ 


0 


>3297806 


/21867 






len = 


1979 nex = 


7 





Term 41785 41666 
Intr 41937 41872 
Intr 42102 42007 






Intr 
Intr 
Intr 
Init 



42340 42238 

42580 42423 

42933 42811 

43304 43034 



>3297806 
len = 



/2306 



1972 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



41785 
41937 
42102 
42340 
42580 
42933 
43304 



41666 
41872 
42007 
42238 
42423 
42811 
43034 



>3297806 

len = 

Init 
Intr 
Intr 
Term 



2093 nex 



45275 
45924 
46640 
46967 



45667 
46300 
46876 
47367 



>3297806 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3297806 

len = 



988 

47770 
47971 
48108 
48295 
48496 



1719 



nex = 

47885 
47992 
48213 
48402 
48757 



Init 
Intr 
Intr 
Term 



50596 50900 

51195 51391 

51826 51877 

51988 52314 

/8161 



Term 
Intr 
Intr 
Init 



2257 nex = 

67615 67335 

67811 67699 

68621 68463 

69591 69324 



>3297806 



/9341 
790 nex 



Term 72925 72662 
Init 73446 73018 






>3297806 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2658 

83301 
83518 
83942 
84248 
84531 
84770 
85083 
85457 



nex = 

82800 
83381 
83742 
84171 
84352 
84621 
84883 
85176 



>3298532 
len = 
Sngl 
>3298532 



1830 nex = 



Term 
Intr 
Intr 
Intr 
Init 



26390 26016 

26666 26475 

26876 26747 

27100 26958 

27845 27756 



/630 



Init 
Intr 
Intr 
Term 

>3298532 



45179 45777 

45865 46206 

46287 46361 

46938 47266 



Init 
Intr 
Term 



48070 48390 
49014 49037 
49150 49670 



>3298532 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



/28303 
2115 nex 



54352 
54548 
54758 
54914 
55095 
55479 
56024 



53910 
54474 
54684 
54849 
55019 
55310 
55735 



>3298532 



/41121 






len = 
Sngl 
>3299824 
len = 

Term 

Intr 
Intr 
Intr 
Init 

>3299824 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3299824 

len = 

Init 
Term 

>3299824 

len = 

Init 
Term 

>3299824 

len = 

Init 
Term 

>3299824 

len = 

Init 
Intr 
Intr 
Term 

>3299824 

len = 



454 nex = 

64301 64754 

729869 

2330 nex = 

16577 16462 

17378 17142 

17531 17470 

18024 17899 

18270 18221 

/20753 

1870 nex = 

18439 18529 

18632 18674 

19090 19131 

19213 19441 

19805 20301 

/15442 

617 nex = 

25851 25868 
25952 26467 

7711 

1732 nex = 

35976 36226 
37044 37707 

7113135 

1721 nex = 

35987 36226 
37044 37707 

717748 

1376 nex = 

48724 48965 

49089 49189 

49289 49482 

49567 50099 

793014 

960 nex = 



Init 



73060 73121 








73213 


73463 


+ 


Term 


73906 


74019 


+ 


>3299824 


/111207 




len = 


1604 


nex = 


3 


Init 


97176 


97361 


+ 




97704 


97837 


+ 


Term 


97933 


98779 


+ 


>3299824 


737727 




len = 


1410 


nex = 


2 




99110 


99786 


+ 


Term 


99950 


100519 


+ 


>3309259 


/40330 




len = 


1966 


nex = 


3 


Init 


85348 


85495 


+ 




85653 


85944 


+ 


Term


86040 


87313 




>3309259 


/27711 




len = 


1630 


nex = 


4 


Term 


89111 


88858 


_ 


Intr 


89328 


89210 


- 


Intr 


89667 


89615 




Init


89802 


89747 






/119712 




len = 


1705 




4 


Term 


89111 


88841 




Intr 


89328 


89210 


- 


Intr 


89667 


89615 




Init 


89802 


89747 


- 


>3309259 


/118778 




len = 


987 


nex = 


1 


Sngl


93145 


92159 






/7149 




len 


1455 


nex = 


6 


Term 


49451 


49232 




Intr 


49635 


49550 




Intr 


49804 


49734 




Intr 


50152 


50051 




Intr 


50352 


50254 




Init 


50686 


50577 












1231 


nex = 


1 






Sngl


1451 


2681 




0 




>3319339 


/34540 








len = 


1813 


nex = 


5 




10 
















45077 


45206 


+ 


0 






45305 


45474 




0 






45506 


45885 




0 






45999 


46165 


+ 


0 


15 


Term 


46268 


46560 




0 




>3319339 


/32443 








len = 


2033 




5 


















Term 


3121 


2602 




0 




Intr 


3336 


3204 




0 




Intr 


3716 


3488 




0 




Intr 


4312 


4280 






25 


Init 


4634 


4405 








>3319339 


/1360 








len = 


1870 


nex = 


7 




3 0 
















52990 


53088 




0 




Intr 


53171 


53228 


+ 


0 




Intr 


53336 


53462 


+ 


0 




Intr 


53548 


53717 




0 


35 


Intr 


53870 


54149 








Intr 


54243 


54409 








Term 


54518 


54850 




0 




>3319339 


/27916 






40 
















1832 


nex = 








Init 


52992 


53088 




0 




Intr 


53171 


53228 


+ 


0 


45 


Intr 


53336 


53462 


+ 


0 




Intr 


53548 


53717 




0 




Intr 


53870 


54149 








Intr 


54243 


54409 




0 




Term 


54518 


54823 




0 


50 














>3319339 /27199
len = 2013 nex = 6
Init 70541 70591 + 0
Intr 70680 70806 + 0
Intr 70898 71067 + 0
Intr 71347 71626 + 0
Intr 71727 71875 + 0
Term 71991 72346 + 0






>3319339 /19839
len = 1038 nex =
Term 86289 85954
Init 86991 86748

>3319365 /26983
len = 1471 nex =
Term 25198 24914
Intr 25328 25285
Intr 25499 25442
Intr 25917 25824
Init 26384 26126

>3319365 /113133
len = 1059 nex =
Sngl 39553 39787

>3319365 /37317
len = 472 nex =
Sngl 39061 38590

>3319365 /33945
len = 370 nex =
Sngl 46052 45690

>3327922 /101361
len = 938 nex =
Term 25827 25600
Init 26537 25924

>3327922 /38439
len = 2141 nex =
Init 32746 32973
Intr 33067 33167
Intr 33246 33289
Intr 33713 34348
Term 34423 34886

>3327922 /19349
len = 591 nex =
Sngl 35078 35668



>3327922



727727 






1390 nex = 



Init 
Intr 
Intr 
Intr 
Term 

>3327922 
len = 
Sngl 

>3327922 

len = 

Init 
Intr 
Term 

>3327922 

len = 

Init 
Intr 
Intr 
Term 

>3327922 

len = 

Init 
Intr 
Intr 
Term 

>3327922 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3327922 

len = 

Sngl 



39098 
39314 
39746 
39971 
40408 



39224 
39394 
39866 
40249 
40487 



46107 46791 
738395 



56836 57097 
58412 58924 
59024 59694 



1065 nex ■ 



69508 
69677 
69891 
70458 



69568 
69792 
69954 
70572 



1489 nex = 

69508 69568 

69677 69792 

69891 69954 

70458 70996 



/18513 



1765 

71210 
71451 
71857 
72008 
72320 
72534 
72732 



nex = 

71356 
71651 
71917 
72209 
72441 
72638 
72974 



/14549 
925 nex = 
80147 81071 



>3327922



729298 






len = 831 nex =
Sngl 80171 81001

>3327922 /34656
len = 801 nex =
Sngl 80201 81001

>3327922 /39583
len = 490 nex =
Sngl 80576 81064

>3327922 /13054
len = 370 nex =
Sngl 80640 80995

>3335331 /5586
len = 793 nex =
Init 17573 17712
Intr 17853 18139
Term 18220 18365

>3335331 /32284
len = 4400 nex =
Init 2322 2529
Intr 2902 3011
Intr 3108 3218
Intr 3352 3416
Intr 3502 3583
Intr 4342 4416
Intr 4842 4912
Intr 5033 5115
Intr 5687 5751
Intr 5986 6060
Term 6374 6721

>3335331 /4326
len = 953 nex =
Term 64184 63680
Init 64632 64297

>3335331 /6906
len = 3250 nex =
Init 7810 8237
Intr 8419 8481






Intr 8573 8668 + 0
Intr 9068 9115 + 0
Intr 9290 9361 + 0
Intr 10005 10063 + 0
Intr 10270 10360 + 0
Term 10857 11050 + 0



>3335331 /19247
len = 3193 nex =
Init 7832 8237
Intr 8419 8481
Intr 8573 8668
Intr 9068 9115
Intr 9290 9361
Intr 10005 10063
Intr 10270 10360
Term 10857 11024



1125 nex =
Term 79626 79511
Init 80234 80142

>3335331 /42267
len = 939 nex =
Init 81876 82398
Term 82623 82814

>3335331 /17230
len = 1210 nex =
Sngl 83926 85126

>3335356 /23238
len = 1810 nex =
Term 99574 98995
Init 100803 100401

>3335356 /31586
len = 4099 nex =
Init 40725 40991
Intr 41490 41675
Intr 41764 41876
Intr 41973 42156
Intr 42435 42532
Intr 43054 43261
Intr 43348 43605
Term 44286 44823






1161 

>3335356 /4723
len = 711 nex = 3
Init 48610 48831
Intr 48899 48963
Term 49052 49320








>3335356 


/15751 






1 0 














len 


4072 


nex = 








Term 


57235 


56932 


_ 


0 




Intr 


58046 


57956 


- 


0 


1 5 




58325 


58155 




^ 






58546 


58430 








Intr 


58799 


58671 


- 


0 




Intr 


59124 


58900 


- 


0 




Intr 


59544 


59348 






20 


Intr 


59678 


59650 










60330 


60263 








Init 


61003 


60621 


_ 


0 




>3335356 


77394 






25 














len = 


1030 










Init 


65862 


66005 


+ 


0 




Term 


66091 


66596 


+ 


0 


30 














>3335356 /25934
len = 910 nex = 3
Init 67764 68104 + 0
Intr 68204 68310 + 0
Term 68403 68673 + 0




>3335356 


719998 






4 0 














len = 


1039 


nex = 








Sngl 


79183 


80221 






45 


>3337347 


74598 








len = 


2715 


nex = 


7 






Init 


32576 


32711 


+ 


0 






33094 


33465 








Intr 


33623 


33850 


+ 


0 




Intr 


33937 


34291 


+ 


0 




Intr 


34378 


34502 


+ 


0 




Intr 


34599 


34734 


+ 


0 


55 


Term 


34820 


35288 


+ 


0 




>3337347 /36037
len = 1937 nex = 6
Init 33121 33465
Intr 33623 33850
Intr 33937 34291
Intr 34378 34502
Intr 34599 34734
Term 34820 35057

>3337347 /37499
len = 2676 nex =
Init 36337 36480
Intr 36582 36614
Intr 36700 36764
Intr 36846 36933
Intr 37031 37103
Intr 37379 37485
Intr 37592 37691
Intr 37841 37917
Intr 38020 38068
Intr 38160 38206
Intr 38297 38367
Intr 38476 38531
Intr 38603 38732
Term 38833 39012



Term 
Intr 
Init 

35 >3337347 
len = 
Sngl 

40 

>3337347 
len = 
45 Sngl 
>3341671 
len = 



50 



Term 
Init 



212 9 nex = 

48472 48232 
49760 49222 
50360 50200 

/29146 

702 nex = 

60015 59314 

723727 

979 nex = 

95243 96221 

/105595 

1168 nex = 

32434 32064 
32805 32715 

739499 

2552 nex = 



Term 32434 32032 
Intr 32805 32715 
Intr 33301 33190






Intr 33563 33475
Intr 33731 33661
Init 34583 34277

>3341671 /5217
len = 610 nex =
Init 39041 39309
Term 39394 39647

>3341671 /30696
len = 133 nex =
Sngl 41075 40943

>3341671 /40641
len = 869 nex =
Init 45689 45897
Term 46011 46557

>3341671 /111157
len = 610 nex =
Init 45812 45897
Term 46011 46420

>3341671 /40953
len = 1719 nex =
Term 47540 47148
Intr 47874 47709
Intr 48054 47945
Intr 48358 48140
Init 48866 48453

>3341671 /1136
len = 314 nex =
Sngl 58089 57776

>3341671 /42237
len = 1709 nex =
Term 68153 67782
Intr 68844 68674
Intr 69172 69084
Init 69484 69280



/34828 



len = 



1345 nex = 






Init 72023 72154
Intr 72231 72327
Intr 72444 72516
Intr 72596 72697
Intr 72783 73032
Term 73118 73367

>3341671 /36996
len = 3070 nex =



Init 
Intr 
Intr 
Intr 
Intr 

>3341671 
len = 



78677 78786 

78936 79004 

79193 79308 

79630 79750 

80100 80194 

80294 80892 

/114909 

1518 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3341671 

len = 



78682 78786 

78936 79004 

79193 79308 

79376 79445 

79630 79750 

80100 80194 

/19760 



Term 
Init 



83127 82506 
83783 83570 



>3355463 /11583
len = 1192 nex =
Term 19567 19377
Intr 19829 19673
Intr 20349 20214
Init 20568 20449

>3355463 /124576
894 nex =
Init 24459 24491
Term 24870 25152

>3355463 /22479
len = 1198 nex =
Init 44058 44421
Term 44510 44770



>3355463



727485 






len = 1150 nex =
Sngl 43581 44721

>3355463 /16403
len = 1510 nex =
Init 44997 45084
Intr 45167 45248
Intr 45334 45438
Intr 45532 45605
Intr 45692 45814
Intr 45902 45973
Intr 46058 46154
Term 46240 46506

>3355463 /3850
len = 1187 nex =
Term 60749 60655
Intr 61079 60839
Intr 61271 61155
Intr 61421 61359
Init 61841 61597

>3355463 /15933
len = 2560 nex =
Term 90887 90794
Intr 91034 90970
Intr 91211 91128
Intr 91605 91540
Intr 91892 91758
Intr 92227 92182
Intr 92490 92384
Intr 93059 93011
Init 93353 93139

>3366536 /27918
len = 1725 nex =
Term 26503 26130
Intr 26880 26613
Intr 27116 26996
Intr 27259 27206
Intr 27499 27394
Init 27854 27748

>3366536 /32438
len = 755 nex =
Init 32527 32606
Intr 32715 32774
Intr 32882 32935






Term 33014 33281 

>3366536 /5485
len = 1553 nex =
Term 73632 73175
Init 74390 73723

>3367500 /11257
len = 1890 nex =
Init 11335 11645
Term 11808 12594

>3367500 /125642
len = 2198 nex =
Init 27603 28220
Intr 28298 28420
Intr 28512 28628
Term 29428 29800

>3367500 /21771
len = 2129 nex =
Init 27671 28220
Intr 28298 28420
Intr 28512 28628
Term 29428 29799

>3367500 /104934
len = 1930 nex =
Init 27736 28220
Intr 28298 28420
Intr 28512 28628
Term 29428 29665

>3367500 /25284
len = 865 nex =
Init 27765 28220
Intr 28298 28420
Term 28512 28629

>3367500 /37435
len = 2016 nex =
Term 29954 29748
Intr 30174 30040
Intr 30390 30253
Intr 30575 30469
Intr 30799 30669






Intr 30915 30875
Intr 31085 30985
Intr 31254 31167
Intr 31505 31404
Init 31763 31585



>3367500 /35575
len = 281 nex =
Sngl 44549 44829

>3367567 /95636
len = 411 nex =
Sngl 1 411

>3367567 /148676
len = 1341 nex =
Term 16182 16130
Intr 16440 16386
Intr 16623 16539
Init 17470 16947

>3367567 /16204
len = 1178 nex =
Term 31280 31024
Intr 31447 31367
Intr 31568 31524
Intr 31726 31669
Intr 31913 31827
Init 32050 31996

>3367567 /31322
len = 1172 nex =
Term 31280 31030
Intr 31447 31367
Intr 31568 31524
Intr 31726 31669
Intr 31913 31827
Init 32050 31996

>3367567 /34272
len = 1514 nex =
Sngl 48006 49519

>3367567 /4745
len = 614 nex =
Term 74205 73805








Intr


74418 


74292 








>3367567


/17117 






5 


len = 


1216 


nex = 


3 






Term 


84942 


84402 


- 


0 




Intr 


85322 


85058 








Init 


85617 


85409 




0 
















>3367567 


/42841 








len 


2110 


nex = 






15 


Term 


86680 


86417 




0 




Intr 


86891 


86754 




0 




Intr 


87123 


86976 




0 




Intr 


87386 


87274 




0 




Intr 


87739 


87483 


- 


0 


2 0 


Intr 


88305 


88020 








Init 


88518 


88382 








>3367567 


/13699 






25 


len = 


1019 


nex = 


4 






Init 


88880 


89082 










89175 


89371 








Intr 


89462 


89532 


+ 


0 


30 


Term 


89633 


89686 


+ 


0 




>3386593 


/41984 








len = 


562 


nex = 


1 




35 














Sngl 


14092 


14653 


+ 


0 




>3386593 


/82 








40 


len = 


140 




1 






Sngl 


16606 


16467 








>3386593 


/19844 






45 














len = 


1150 


nex = 


3 






Init 


4750 


5187 








Intr 


5328 


5456 






5 0 


Term 


5544 


5890 










/18854 








len = 


1713 


nex = 


4 




55 














Init 


62237 


62493 


+ 


0 




Intr 


63184 


63237 


+ 


0 




Intr 


63333 


63420 


+ 


0 




Term 


63493 


63944 


+ 


0 






len = 


1213 


nex = 


3 




. 

Init


62306 


62493 




0 




63184 


63237 




0 


^^^^ 

Term


63333 


63420 




0 


>3386593


/998 






len 


2124 


nex = 


10 




Term 


69215 


69011 


- 


0 


Intr 


69358 


69307 




0 


Intr 


69546 


69457 




0 


Intr 


69727 


69632 




0 


Intr 


69907 


69808 




0 


Intr 


70047 


69992 


- 


0 




70209 


70174 




0 


Intr 


70369 


70293 




0 


Intr 


70834 


70768 


_ 


0 


Init 


71134 


70927 




0 


>3386593 


/13925 






len = 


1676 




3 




Init 


7808 


7937 






Intr 


8704 


9013 






Term 


9133 


9483 






>3386593 


/12250 






len = 


1667 


nex = 


7 




Term 


85697 


85517 


- 


0 


Intr 


86125 


85952 




0 


Intr 


86285 


86211 


- 


0 


Intr 


86455 


86375 






Intr 


86624 


86550 






Intr 


86781 


86707 






Init 


87183 


86857 




. 


>3386593 


/12487 






len = 


1630 


nex = 


6 






85849 


85656 




0 


Intr 


86125 


85952 




0 


Intr 


86285 


86211 




0 


Intr 


86455 


86375 




0 


Intr 


86624 


86550 




0 


Init 


86781 


86707 




0 


>3386593 


/9000 






len = 


2770 


nex = 


13 





Init 87778 87938 
Intr 88044 88171






Intr 88293 88398
Intr 88487 88568
Intr 88661 88731
Intr 88818 88859
Intr 88982 89106
Intr 89196 89289
Intr 89407 89482
Intr 89599 89762
Intr 89840 89945
Intr 90121 90207
Term 90311 90531

>3395421 /123915
len = 625 nex =
Init 14364 14582
Term 14746 14988

>3395421 /14329
len = 879 nex =
Term 18618 18349
Intr 18781 18712
Init 19227 18940

>3395421 /43031
len = 1786 nex =
Term 40343 39798
Intr 40877 40709
Intr 41167 40976
Init 41583 41311

>3395421 /25615
len = 833 nex =
Sngl 50675 51507

>3395421 /16674
len = 1056 nex =
Init 9467 9820
Intr 9908 10031
Term 10113 10522

>3399678 /3807
len = 2443 nex =
Term 9355 8974
Intr 9578 9453
Intr 9939 9691
Intr 10435 10084
Intr 10680 10541
Init 11416 10972






2953 nex =
Init 11842 11976
Intr 12277 12410
Intr 12562 12643
Intr 12772 12839
Intr 12966 13067
Intr 13154 13243
Intr 13590 13664
Intr 13766 13798
Intr 13901 13978
Intr 14081 14137
Term 14395 14794



>3399678 

20 len = 

Sngl 

>3399678 

len = 

Init 
Intr 
Intr 
Intr 
Term 



25 



1815 nex = 

4154 4420 

4549 4621 

5205 5269 

5617 5726 

5816 5968 

/6672 



Init 53802 54056 + 0
Intr 54661 54805 + 0
Intr 54887 54982 + 0
Intr 55372 55448 + 0
Intr 55709 55788 + 0
Intr 55934 55994 + 0
Intr 56137 56204 + 0
Intr 56296 56377 + 0
Intr 56467 56556 + 0
Intr 56645 56722 + 0
Intr 56864 56983 + 0
Term 57108 57474 + 0



2082
Init 55396 55448
Intr 55709 55788
Intr 55934 55994
Intr 56137 56204
Intr 56296 56377
Intr 56467 56556
Intr 56645 56722
Intr 56864 56983
Term 57108 57474



>3399678 

len = 

Sngl 

>3399678 

len = 

Sngl 

>3399678 

len = 

Init 
Intr 
Intr 
Term 



/155960 
513 nex = 
73631 74143 



1427 nex = 

84754 84947 

85039 85251 

85353 85508 

85609 86180 



>3402671 /853
len = 3156 nex =
Term 104995 104737
Intr 105259 105204
Intr 105621 105565
Intr 106216 106066
Intr 106605 106557
Intr 106819 106726
Intr 106986 106914
Intr 107186 107086
Init 107892 107260

>3402671
len = 2213 nex =
Init 16497 16715
Intr 16919 17386
Intr 17497 17952
Term 18047 18451

>3402671 /33545
len = 2633 nex =
Init 21758 21965
Intr 22289 22750
Intr 23024 23482
Term 23572 23906



/29616 



60 



len = 



1257 nex = 



6 






Term 
Intr 
Intr 
Intr 
Intr 
Init 

>3402695 

len = 

Term 
Init 

>3402695 
len = 

>3402695 

len = 

Term 
Init 

>3402695 

len = 

Term 
Intr 
Init 

>3402695 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3402695 

len = 

Sngl 

>3402745 



29445 
29604 
29812 
29980 
30161 
30514 



29258 
29548 
29699 
29930 
30064 
30442 



13030 12578 
13512 13440 



/152917 
653 nex 



44889 
45337 



44516 
45254 



49228 48980 
49706 49346 
49924 49810 



3715 

64962 
65129 
65432 
65870 
66348 
66853 
67168 
67504 
67667 
67847 
68048 
68337 



nex = 

64623 
65078 
65316 
65778 
66129 
66694 
67094 
67317 
67586 
67766 
68014 
68152 



/6727 
89 7 nex = 
70904 71800 
733333 



60 len = 



1510 nex = 



3 










24071 


23112 




0 




Intr 


24287 


24158 


- 


0 




Init 


24617 


24377 




0 


5 














>3402745 


/21213 








len = 


922 


nex = 


2 




10 


Init 


29567 


29934 




0 






30280 


30488 


+ 


0 




>3402745 /33975
len = 2309 nex = 7
Init 30883 31350 + 0
Intr 31838 31908 + 0
Intr 31988 32069 + 0
Intr 32169 32287 + 0
Intr 32549 32626 + 0
Intr 32714 32833 + 0
Term 32923 33191 + 0


25 


>3402745 


/10221 








len 


2063 


nex = 


4 






Init 


33374 


33615 


+ 


0 






33835 


34236 




0 




Intr


34556 


34970 




0 




Term 


35046 


35436 




0 




>3402745 


/20924 






35 














len = 


2230 


nex = 


4 






Init 


33384 


33615 


+ 


0 






33835 


34236 


+ 


0 


4 0 


Intr 


34556 


34970 




0 




Term


35046 


35608 


+ 


0 




>3402745 


736892 






45 


len = 


1308 


nex = 


3 








51257 


50714 




0 




Intr 


51406 


51332 




0 




Init 


52021 


51488 




0 


50 














>3402745 


736836 








len = 


2969 


nex = 


14 




55 


Term 


52493 


52292 




0 




Intr 


52706 


52584 




0 




Intr 


52859 


52793 




0 




Intr 


53091 


53007 




0 




Intr 


53395 


53343 




0 


60 


Intr 


53658 


53611 




0 






Intr 


53884 


53742 


0 


Intr 


54086 


53980 


0 


Intr 


54266 


54184 


0 


Intr 


54433 


54360 


0 


Intr 


54609 


54529 


0 


Intr 


54759 


54707 


0 


Intr 


54938 


54876 


0 


Init 


55260 


55060 


0 



>3402745 
len = 
Sngl 
>3402745 

Sngl 

>3402745 

len = 

Term 
Intr 
Intr 
Init 

>3402745 

len = 

Init 
Intr 
Intr 
Intr 
Term 



75185 74973 

/35503 

163 8 nex = 

75185 74973 

/31790 

2014 nex = 

76187 75789 

76568 76278 

76777 76655 

77133 76861 



/4621 



2027 

80600 
81288 
81659 
81819 
82176 



80931 
81467 
81740 
82085 
82626 





len = 588 nex = 2
Init 82005 82085 +
Term 82176 82592 +




>3402745 /17760
len = 2350 nex = 9
Init 85225 85495 +
Intr 85763 85943 +
Intr 86039 86134 +
Intr 86229 86284 +
Intr 86453 86536 +
Intr 86690 86782 +
Intr 86876 86943 +
Intr 87122 87195 +
Term 87314 87572 +

2298



Init 90176 90278
Intr 90538 90721
Intr 90829 90939
Intr 91060 91115
Intr 91287 91370
Intr 91504 91543
Intr 91646 91713
Intr 91818 91891
Term 91975 92274



Init 2703 3023
Intr 3396 3691
Term 3790 4023

>3402747
len =
Init 3464 3691
Term 3790 4016



2240 



Term 2322 2126
Intr 2505 2408
Intr 2729 2583
Intr 2889 2828
Intr 3253 3163
Intr 3389 3336
Intr 3580 3494
Intr 3801 3673
Intr 4116 4046
Init 4365 4221

>3406034 /28528
len = 775 nex =
Sngl 44408 43634

>3406034 /102813
len = 1630 nex =
Init 982 1230
Intr 1313 1442
Term 1995 2193



>3406034 /35331
1690
Init 565 898
Intr 982 1230
Intr 1313 1442
Term 1995 2253

>3406034 /2499
len = 1274 nex =
Init 66373 66517
Intr 66709 67020
Intr 67284 67348
Term 67442 67646

>3406034 /21669
len = 1075 nex =
Term 87265 87035
Intr 87552 87477
Intr 87871 87810
Init 88109 87956

>3413696 /2474
len = 376 nex =
Sngl 72418 72793

>3413696 /110411
len = 800 nex =
Sngl 73023 73822

>3413696 /39666
len = 2010 nex =
Term 74013 73835
Intr 74154 74093
Intr 74343 74235
Intr 74570 74521
Intr 74714 74657
Intr 74902 74825
Intr 75138 75089
Intr 75388 75308
Init 75844 75677

>3420042 /31044
len = 1153 nex =
Sngl 16199 15047



>3420042 /33789
1178

len = 1543 nex = 4
Term 49244 48780 - 0
Intr 49472 49347 - 0
Intr 49668 49615 - 0
Init 50322 49851 - 0



>3420042 /33195 



25 >3420042 /6397 



45 >3420043 /22599 



50 



55 >3420043 /36979 



+ 0 
+ 0 
+ 0 



+ 0 

+ 0 

+ 0 

+ 0 



0 



0 
0 
0 
0 
0 



0 



+ 0 



Init 51312 51412 
Intr 51803 51916 
Term 52019 52117 



1630 



Init 51312 51412 

Intr 51803 51916 

Intr 52019 52155 

Term 52279 52540 



len = 

Sngl 

>3420043 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>3420043 

len = 



913 nex = 
58119 57207 
/38891 
3670 



32310 
33332 
33637 
34403 
35464 



nex = 

31803 
33174 
33456 
34232 
35021 



735872 
1066 nex 



len = 1104 nex = 
>3420043 /124634 

len = 2510 nex = 

Sngl 41769 41941 



len = 3413 nex = 8 



Init 56129 56549
Intr 57173 57262
Intr 57809 57875 + 0
Intr 58355 58447 + 0
Intr 58535 58720 + 0
Intr 58809 58889 + 0
Intr 58968 59105 + 0
Term 59204 59541 + 0




>3420043 


79257 






10 


len = 


593 


nex = 


1 






Sn 1 


51934 


61342 




0 




>3420043 


/19211 






15 














len = 


1522 


nex = 


6 






Init 


70609 


70725 


+ 


0 




Intr 


70916 


70985 




0 


20 


Intr 


71126 


71173 


+ 


0 




Intr 


71279 


71302 


+ 


0 






71525 


71707 


+ 


0 




Term 


71791 


72130 




0 


2 5 


>3420043 


/206307 








len =


1090 


nex = 


5 






Init 


70617 


70725 


+ 


0 


30 


Intr 


70916 


70985 


+ 


0 




Intr 


71126 


71173 




0 




Intr


71279 


71302 








Term


71525 


71706 




0 


35 


>3420043 /150229
len = 1511 nex = 6
Init 70617 70725 + 0
Intr 70916 70985 + 0
Intr 71126 71173 + 0
Intr 71279 71302 + 0
Intr 71525 71707 + 0
Term 71791 72127 + 0


45 














>3420043 


/13487 








len 


1518 


nex = 


0 




50 


>3426033 /10624
len = 2499 nex = 11
Init 10293 10440 + 0
Intr 10665 10760 + 0
Intr 10841 10909 + 0
Intr 11001 11105 + 0
Intr 11183 11239 + 0
Intr 11323 11523 + 0
Intr 11684 11776 + 0
Intr 11858 11957
Intr 12082 12181
Intr 12426 12511
Term 12597 12791



>3426033 /33835
len = 393 nex =
Init 29182 29215
Term 29293 29574

>3426033
len = 3550
Term 57690 57420
Intr 57870 57778
Intr 58073 58007
Intr 58268 58163
Intr 58579 58495
Intr 59100 59009
Intr 59365 59243
Intr 59760 59655
Intr 60149 60043
Intr 60338 60247
Init 60965 60797



len = 1643 nex =
Init 35160 35638
Intr 35732 35801
Intr 36165 36327
Term 36415 36802

>3445196
len =

>3445196 /121762
len = 3654 nex =
Init 70556 70702
Intr 70777 70922
Intr 71023 71099
Intr 71180 71302
Intr 71707 71809
Intr 71878 71915
Intr 72121 72219
Intr 72321 72407
Intr 72500 72568
Intr 72754 72825
Intr 72906 73027
Term 73847 74209



>3445196 

60 



735
len =


2265 


nex = 


6 




Init 


72121 


72219 


+ 




Intr 


72321 


72407 


+ 


5 


Intr 


72500 


72568 


+ 




Intr 


72754 


72825 


+ 






72906 


73027 






Term 


73847 


74140 


+ 


10 


>3445196 


/3181 






len 


558 


nex = 








82071 


82214 




15 


Term 


82390 


82431 


+ 




>3449311 /5361
len = 1236 nex = 2
Term 27604 27255
Init 28490 28213






>3449311 


/141753 




25 












len = 


1072 


nex = 


4 




Term 


47195 


46888 






Intr 


47404 


47285 


- 


30 


Intr 


47631 


47489 






Init 


47959 


47723 






>3449311 /30287
len = 1117 nex = 4
Term 47195 46889
Intr 47404 47285
Intr 47631 47489
Init 48005 47723






>3449311 


/121388 






len = 


2530 




5 


45 












Init 


51403 


51752 


+ 




Intr 


51838 


51965 






Intr 


52072 


52305 






Intr 


53291 


53537 




50 


Term 


53646 


53926 






>3449311 /21192
len = 2530 nex = 5
Init 51403 51752 +
Intr 51838 51965 +
Intr 52072 52305 +
Intr 53291 53537 +
Term 53646 53931 +

len = 1491 nex =
Init 20409 20522
Intr 20834 20899
Intr 21007 21151
Intr 21246 21464
Intr 21559 21772
Term 21862 21890

Init 35173 35268
Intr 35340 35570
Intr 35843 35940
Intr 36031 36217
Intr 36303 36433
Intr 36515 36751
Term 36841 37327

>3449312 /2159
len = 881 nex =
Sngl 40080 40960

>3449312 /11266
len = 2506 nex =
Init 70384 70926
Intr 71299 71633
Intr 71731 71930
Term 72582 72889

>3449312 /37540
len = 2540 nex =
Init 8228 8379
Intr 8897 8980
Intr 9081 9301
Intr 9387 9465
Intr 9654 9823
Intr 9965 10151
Intr 10243 10320
Term 10403 10767

>3449313 /40718
len = 743 nex =
Term 17412 16864
Init 17606 17499

>3449313



76699 






2291 nex =
Term 19583 18826
Intr 20255 19887
Init 21116 20331

>3449313 /31573
len = 1964 nex =
Init 27831 28026
Intr 28183 28618
Term 29403 29794

>3449313 /147765
len = 831 nex =
Sngl 55558 54728

>3449314 /25475
len = 698 nex =
Sngl 1344 647

>3449314 /6982
len = 1359 nex =
Term 6894 6520
Init 7878 7794

>3449315 /29551
len = 1560 nex =
Term 2958 2620
Intr 3194 3059
Intr 3363 3280
Intr 3509 3452
Init 4179 3608

>3449316 /107101
len = 391 nex =
Sngl 46435 46825

>3449316 /17050
len = 477 nex =
Sngl 8487 8751

>3449317 /3729
len = 2698 nex =
Term 36464 36043
Intr 36675 36563
Intr 36858 36770
Intr 37059 36954
Intr 37343 37287
Intr 37908 37819
Intr 38239 38014
Intr 38401 38368
Init 38713 38596

>3449317
len =



Init 
Intr 
Term 

>3449317 

len = 

Init 
Intr 
Term 

>3449317 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3449318 

len = 

Term 
Intr 
Init 

>3449320 

len = 

Init 
Intr 
Term 

>3449320 

len = 



/2842 

16 64 nex = 

38942 39208 
39996 40060 
40346 40605 



5652 5704 
5785 5868 
5959 6289 



2140 nex = 

6801 6845 

7257 7326 

7438 7656 

7745 7824 

7917 8033 

8150 8454 

8552 8940 

/1523 

931 nex = 

21868 21538 

22060 21983 

22468 22163 

/31551 

954 nex = 

22442 22860 

23003 23177 

23304 23395 

/14044 

1603 nex = 



Init 37144 37440 
Intr 38083 38151
Intr
Term 

>3449320 

len = 

Init 
Intr 
Intr 
Term 

>3449320 
len = 
Sngl 

>3449320 

len = 

Init 
Term 

>3449321 
len = 
Sngl 

>3449321 

len = 

Term 
Init 

>3449321 

len = 

Term 
Init 

>3449321 

len = 

Init 
Intr 
Intr 
Term 

>3449322 

len = 



38254 38360 
38471 38745 

727973 

1526 nex = 

48903 49084 

49585 49623 

49797 50027 

50116 50428 

/12322 

524 nex = 

60298 59775 

74875 

984 nex = 

62405 62653 
63001 63388 

732212 

63 8 nex = 

19476 20102 

76441 

2658 nex = 

33609 32653 
35310 34793 

727795 

850 nex = 

36545 36330 
37179 36861 

714414 

1495 nex = 

43130 43391 

43495 43751 

43847 43993 

44204 44624 

7149970 

1056 nex = 



Term 15835 15245 
Init 16155 15991

>3449322 /970
len = 496 nex =
Sngl 994 499

>3449323 /35890
len = 1873 nex =
Init 27590 27970
Intr 28290 28552
Term 28595 28757

>3449323 /17159
len = 1630 nex =
Init 32944 33507
Term 34103 34565

>3449323 /21382
len = 276 nex =
Sngl 34337 34612

>3449323 /42747
len = 2074 nex =
Term 35032 34594
Intr 35437 35219
Intr 35698 35519
Init 36667 36397

>3449323 /39065
len = 869 nex =

>3449323 /13801
len = 2151 nex =
Term 52032 51755
Intr 52225 52124
Intr 52436 52311
Intr 52789 52529
Intr 53298 52875
Intr 53623 53505
Init 53905 53724

>3449323 /122618
len = 1210 nex =
Term 69206 68963
Intr 69709 69545
Init 69969 69811

/13730

1956 nex = 

71264 70606 
71787 71394 

/19092 

710 nex = 



>3449323 

len = 

Term 
Init 

>3449324 

len = 

Init 
Intr 
Term 

>3449325 

len = 

Term 
Intr 
Init 

>3449325 

len = 

Term 
Intr 
Init 

>3449325 

len = 

Term 
Intr 
Init 

>3449326 

len = 

Term 
Init 

>3449326 

len = 

Init 
Term 

>3449326 

len = 



22002 
22241 
22512 



34360 
34566 
35207 



34360 
34566 
35271 



40783 
41008 
41330 



22115 
22398 
22711 



33685 
34431 
34772 



33734 
34431 
34772 



40075 
40873 
41098 



/119717 

656 nex = 

23133 22788 
23443 23373 

/3086 

1090 nex = 

46084 46198 
46486 47171 

742926 

2685 nex = 



/21890 
1523 nex = 



742965 
1538 nex = 



713806 
1256 nex = 



Term 61606 61430 
Intr 61803 61717 
Intr
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3449326 

len = 

Init 
Term 

>3449327 
len = 
Sngl 

>3449327 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3449327 

len = 

Init 
Intr 
Intr 
Term 

>3449327 
len = 
Sngl 

>3449327 

len = 

Term 
Intr 
Init 



61962 
62569 
62770 
62992 
63168 
63494 
64114 



61888 
62465 
62672 
62859 
63090 
63276 
63864 



71680 72166 
72722 72974 



14331 13937 
/37055 
2110 nex = 



20457 
20670 
20855 
21036 
21174 
21586 
21744 
21920 
22057 
22225 



20120 
20567 
20756 
20939 
21122 
21515 
21673 
21825 
21990 
22134 



727492 

1820 nex = 

26052 26288 

26367 26724 

26809 27384 

27475 27871 

/11250 
1693 nex = 
35338 34719 

/157644 

1090 nex = 

5305 5089 
5612 5397 
5960 5886 



>3449327



/14401 
1180 nex =



Term 
Intr 
Init 

>3449327 

len = 

Term 
Intr 
Init 

>3449327 
len = 
Sngl 

>3449327 

len = 

Init 
Intr 
Term 

>3449327 

len = 

Init 
Intr 
Term 

>3449327 

len = 

Init 
Intr 
Term 

>3449329 

len = 

Sngl 

>3449329 

len = 

Sngl 

>3449329 

len = 



5305 5037 
5612 5397 
5960 5886 

738345 

1496 nex =

61922 61607 
62622 62493 
63102 62707 

/12989 
1634 nex = 
65098 64162 

739633 

1858 nex = 

65970 66124 
66222 66328 
67542 67827 

73226 

1630 nex = 

7695 7813 
7943 8038 
8118 8562 

714416 

1425 nex = 

7695 7813 
7943 8038 
8118 8560 

799348 

771 nex = 

22128 21358 

739355 

456 nex = 

22146 21709 

734273 

1101 nex = 
Sngl
>3449329 

5 

len = 

Init 
Intr 

Intr
Intr 
Term 



>3449329 



15 



len = 
Sngl 

>3449329
len = 
Sngl 

25 

>3449329 

Init
Intr 
Term 



Term 
Init 



>3449330 

len = 

Term
Intr 
Init 



Term 
Intr 

Init
>3449331 
len = 



3712 2612 

/33707 

1699 nex =

47513 47739 

47826 48010 

48122 48322 

48410 48571 

48642 49211 

/117597 

1750 nex = 

47513 47739 

/112975 

656 nex = 

77611 76956 

/7878 

2318 nex = 

79072 79208 
79974 80125 
80277 81389 

/110801 

575 nex = 

83763 83571 
84145 84015 

/11513 

1658 nex = 



63037 62811 
63717 63115 
64468 64216 

/18210 

1246 nex =

14046 13722 
14527 14422 
14967 14798 

735596 

1469 nex = 
Init


16127 


16294 








Term 


17028 


17595 








>3449331 


/6591 






5 














len = 


2293 


nex = 


2 






Init 


18665 


19043 








Term 


19547 


20106 






1 0 














>3449331 


/31901 








len 


2110 


nex = 






15 


Term 


55268 


55105 










55402 


55362 








Intr 


55630 


55500 








Intr 


55809 


55716 


_ 


0 




Intr 


55980 


55903 




0 


20 


Intr 


56162 


56074 




0 




Intr 


56315 


55251 


- 


0 




Intr 


56488 


56397 








Intr 


56676 


56589 








Intr 


56832 


56746 






2 5 


Init 


57213 


57145 








>3449331 


/7901 








len = 


2453 


nex = 






3 0 














Term 


55268 


54818 








Intr 


55402 


55362 








Intr 


55630 


55500 








Intr 


55809 


55716 


_ 


0 


35 


Intr 


55980 


55903 




0 




Intr 


56162 


56074 


: 


0 




Intr 


56315 


56251 


- 


0 




Intr 


56488 


56397 




0 




Intr 


56676 


56589 


- 


0 


40 


Intr 


56832 


56746 








Init 


57270 


57145 








>3449331


/6701 






45 


len = 


2621 


nex = 








Term 


55268 


54844 








Intr 


55402 


55362 










55630 


55500 






50 


Intr 


55809 


55716 




0 




Intr 


55980 


55903 




0 




Intr 


56162 


56074 




0 




Intr 


56315 


56251 




0 




Intr 


56488 


56397 




0 


55 


Intr 


56676 


56589 




0 




Intr 


56832 


56746 




0 




Init 


57464 


57342 




0 



>3449331 

60 



/24060 



Reference No. 2750-942P 



len = 1669 nex =

Term 59217 58869 

Intr 59381 59304 

Intr 59645 59545 

Intr 59840 59743 

Init 60337 60235 



>3449331 

len = 

Init 
Term 

>3449331 

len = 

Term 
Init 

>3449331 

len = 

Init 
Term 

>3449331 

len = 

Init 
Term 

>3449331 

len = 

Init 
Term 

>3449331 
len = 
Sngl 

>3449332 
len = 

>3449332 
len = 



/19444 

701 nex = 

62474 62588 
62729 62796 

/43057 

653 nex = 

64577 64471 
65123 64716 

77828 

713 nex = 

68243 68660 
68755 68955 

/16326 

702 nex = 

68246 68660 
68755 68947 

728773 

699 nex =

68251 68660 
68755 68949 

794898 

310 nex = 

68272 68572 

731894 

1677 nex = 

737326 

2911 nex = 



Term 30818 30775 
Intr 31080 30961 
Intr 31445 31341 






Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



31611 
31753 
31902 
32046 
32231 
32404 
32674 
32859 
32971 
33267 



>3449333 

len = 

Sngl 

>3449333 

len = 

Sngl 

>3449334 

len = 

Term 
Intr 
Init 

>3449334 
len = 
Sngl 

>3449334 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3449334 

len = 

Init 
Intr 
Term 



31534 
31688 
31828 
31984 
32142 
32308 
32577 
32748 
32946 
33058 



/37806 

1510 nex = 

32409 33912 

/39990 

1218 nex = 

35415 34198 

737422 

1129 nex = 

10778 10442 
10889 10867 
11570 11474 

/11092 

432 nex =

38534 38322 

/21208 

1390 nex = 



48947 
49267 
49516 
49742 
49871 
50057 



49088 
49375 
49593 
49775 
49939 
50336 



/12613 

1460 nex = 

59084 59287 
59884 60153 
60375 60543 



len = 

60 



550 nex = 



2 






Init 
Term 



6629 6820 
6919 7170 



>3449334 
len = 



Init 
Term 



>3449334 
len = 



76477 
1114 nex = 



75520 75670 
76364 76633 



/22052 
503 nex = 



Init 
Term 



77334 77521 
77615 77836 



>3449334 
len = 



/154063 
1339 nex =



Term 
Intr 
Intr 
Intr 
Init 



78133 77804 

78297 78209 

78506 78373 

78651 78601 

79142 78752 



>3449334 
len = 



74763 



Init 
Term 



80747 81302 
81744 81782 



>3451055 
len = 



1589 



Init 
Intr 
Intr 
Term 



32795 33172 

33446 33741 

33836 33976 

34110 34383 



>3451055 
len = 



74578 



2544 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



41121 
42072 
42380 
42577 
42805 
42968 
43172 
43361 



41326 
42303 
42459 
42730 
42883 
43055 
43218 
43664 



len = 

60 



2451 nex = 






Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



41200 
42072 
42380 
42607 
42805 
42968 
43172 
43361 



41326 
42303 
42459 
42730 
42883 
43055 
43218 
43650 



>3451055 
len = 



2230 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



61121 
61520 
61868 
62011 
62316 
62528 
62751 
62860 



61435 
61541 
61924 
62092 
62422 
62652 
62769 
63348 



>3451055 
len = 



Term 
Init 



70038 69729 
70454 70123 



506 



nex = 



Sngl 

>3461810 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



78727 79232 
/11805 
2369 nex = 



42733 
42938 
43131 
43376 
43531 
43911 
44222 
44691 



42323 
42849 
43045 
43226 
43464 
43840 
44021 
44486 



>3461810 
len = 



1289 



nex 



Init 
Intr 
Intr 
Term 



53744 
54142 
54403 
54682 



54065 
54297 
54601 
55032 



len =



1243 nex = 



2 






Term 7067
Init 8024 



6782 
7769 



Term 97905 97123 
Intr 98453 98141 
Init 98676 98543 



>3461834 
len = 



/17914 
1906 nex 



Term 
Intr 
Intr 
Init 

>3461834 
len = 
Sngl 

>3461834 

len = 

Init 
Intr 
Intr 
Term 

>3461834 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3482964 



11675 11426 

11885 11695 

12878 12560 

13331 13074 

/38091 



1913 

49465 
50347 
50887 
51126 



nex = 

49733 
50450 
51016 
51377 



2230 nex = 

57851 58312 

58588 58751 

58845 59008 

59093 59352 

59440 59566 

59735 60074 



739928 



2424 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



23256 
23516 
23815 
23978 
24191 
24802 
24971 



22843 
23396 
23590 
23891 
24068 
24731 
24885 



>3482964



/40987 






1881 



nex =



Init 
Intr 
Intr 
Intr 
Intr 
Term 



27474 
27982 
28285 
28664 
28805 
29067 



27850 
28122 
28425 
28713 
28934 
29354 



/31673 



Sngl 
>3482964 



263 nex =
5556 5818 
/14105 



1707 



nex =



Init 
Intr 
Term 

>3482964



77290 77640 
78070 78357 
78446 78996 



1403 



Term
Intr 
Intr 
Intr 
Intr 
Init 



7020 
7158 
7297 
7457 
7801 
8013 



6611 
7104 
7245 
7381 
7552 
7897 



1611 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



7020 
7158 
7297 
7457 
7801 
8050 
8205 



6595 
7104 
7245 
7381 
7552 
7897 
8138 



>3482964 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2668 



nex 



7020 6610 

7158 7104 

7297 7245 

7457 7381 

7801 7552 

8050 7897 

8246 8138 

8481 8418 








len = 


4990 


nex = 


16 




5 


Init 


25848 


25975 


+ 


0 






26062 


26140 


+ 


0 




Intr 


26270 


26331 


+ 


0 




Intr 


26602 


26716 


+ 


0 




Intr 


26811 


26886 


+ 


0 


10 


Intr 


26966 


27061 


+ 


0 




Intr 


27278 


27322 


+ 


0 




Intr 


27671 


27727 


+ 


0 




Intr 


27827 


27982 


+ 


0 




Intr 


28080 


28145 


+ 


0 


15 




28647 


28770 




0 




Intr 


29541 


29654 


+ 


0 




Intr 


29769 


29846 


+ 


0 




Intr 


30008 


30055 


+ 


0 




Intr 


30266 


30334 


+ 


0 


20 


Term 


30495 


30835 


+ 


0 




>3492855 


/15984 








len = 


850 


nex = 


0 




2 5 














>3492855


/31772 








len = 


380 


nex = 


1 




30 


Sngl 


67588 


67967 


+ 


0 




>3492855 


/42247 








len = 


502 


nex = 


1 




3 5 














Sngl


67588 


68089 


+ 


0 




>3492855 


/94363 






40 


len = 


388 


nex = 


1 






Sngl 


67594 


67981 


+ 


0 




>3492855 


/12727 






45 














len 


698 


nex = 


1 






Sngl 


67594 


68291 


+ 


0 


50 


>3492855 


75339 








len = 


262 




1 






Sngl 


67599 


67860 


+ 


0 


55 














>3492855 


78572 








len = 


337 


nex = 


1 





Sngl



67621 67957 



0 






len = 

5 

Sngl 
>3492855 
len =

Sngl 
>3492855 

15 

len = 
Sngl 

>3510247
len = 
Sngl 

25 

>3510247 
len = 

Sngl

>3510247 
len = 

35 

Init 
Intr 
Intr 
Intr 

Intr

Term 

>3510247 

len =

Init 
Intr 
Term 

50 

>3510336 

len = 

Term
Intr 
Intr 
Intr 
Intr 



734643 

692 nex = 

67621 68312 

/8142 

698 nex = 

67639 68336 

/113357 

351 nex = 

67958 68308 

/105240 

670 nex = 

3626 4289 

/142022 

226 nex = 

39517 39742 

726967 

1512 nex = 

42812 43093 

43184 43305 

43393 43479 

43565 43627 

43886 43923 

44021 44323 

/26134 

574 nex = 

42884 43093 
43184 43305 
43393 43457 

738743 

2275 nex = 

10758 10440 

11079 10844 

11501 11396 

11676 11590 

12031 11813 

12289 12119 








Init 


12714 


12390 


- 


0 




>3510336 


729476 






5 


len = 


610 


nex = 


2 






Init 


23119 


23477 


+ 


0 




Term 


23566 


23719 


+ 


0 


10 


>3510336 


/18108 








len 


762 


nex = 


2 






Term 


40411 


40034 


_ 


0 


1 5 


Init 


40795 


40541 




0 




>3510337 


/21311 








len = 


3939 


nex = 


6 




20 














Init 


1201 


1383 


+ 


0 




Intr 


1756 


1821 


+ 


0 




Intr 


3560 


4202 




0 




Intr 


4287 


4390 


+ 


0 


25 


Intr 


4470 


4608 


+ 


0 




Term 


4752 


5139 


+ 


0 




>3510338 


726524 






30 


len = 


686 


nex = 


1 






Sngl


7586 


8271 


+ 


0 




>3510339


/36971 






? c, 














len =


2077 


nex = 


7 






Term 


13327 


13214 


_ 


0 




Intr 


13556 


13446 


- 


0 


40 


Intr 


13746 


13663 




0 




Intr 


13906 


13838 


- 


0 




Intr 


14081 


14016 




0 




Intr 


14316 


14191 


- 


0 




Init 


14804 


14412 




0 


45 














>3510339 


726265 








len 


310 


nex = 


1 




50 


Sngl 


18302 


17998 




0 




>3510339 


73732 








len = 


832 


nex = 


1 




55 














Sngl 


18846 


18015 




0 



len = 1665 nex =






Init 
Intr 
Intr 
Term

>3510339 

len = 



25257 25329 

25742 25813 

26485 26565 

26649 26921 

/101308 



Init 
Intr 
Term 

>3510339 

len = 



27964 28382 
28442 28624 
28735 28965 



4244 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



32101 
32424 
32620 
33398 
33677 
34028 
34225 
34363 
34484 
34705 
34944 
35123 
35275 
35459 
35654 
35780 
36039 



32339 
32541 
32769 
33508 
33760 
34126 
34275 
34412 
34565 
34799 
35010 
35173 
35370 
35557 
35683 
35845 
36344 



len 



1225 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



35127 35173 

35275 35370 

35459 35557 

35654 35683 

35780 35845 

36039 36351 

/7341 

1630 nex = 



Term 
Intr 
Intr 
Intr 
Init 



2222 1970 

2399 2316 

2651 2492 

3073 3025 

3595 3310 

/37148 

1831 nex = 






Init 
Intr
Term 

>3510339 
len = 
Sngl 

>3510339 

len = 

Init 
Intr 
Intr 
Term 

>3510339 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3510339 
len = 
Sngl 

>3510340 

len = 

Term 
Init 

>3510340 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3510340 

len = 



36683 36792 
36895 36995 
37400 38513 

/4190 

570 nex = 

37975 38544 

727885 

910 nex = 

47055 47312 

47405 47463 

47558 47675 

47755 47961 

732353 

1336 nex = 

53149 53304 

53397 53504 

53574 53700 

53970 54038 

54133 54484 

714885 

688 nex = 

8005 7318 

7126460 

934 nex = 

21669 21324 
22257 21750 

740096 

2290 nex =

38031 37656 

38245 38120 

38564 38358 

38726 38658 

38905 38813 

39170 38997 

39945 39536 

714006 

3759 nex = 



60 



Term 



41382 41098 



0 






Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3510341 
len = 

>3510341 
len = 
Sngl 

>3510341 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3510341 

len = 

Term 
Intr 
Init 

>3510342 

len = 

Sngl 

>3510342 



41552 
41808 
42179 
42405 
42585 
42747 
42944 
43109 
43297 
43431 
43733 
43878 
44054 
44229 
44386 
44577 
44856 



41486 
41760 
42123 
42335 
42494 
42675 
42870 
43045 
43210 
43387 
43614 
43840 
43971 
44138 
44312 
44484 
44655 



/10879 
1930 nex = 
/36005 
691 nex = 
40942 41632 
/32092 
2099 nex = 



48160 
48310 
48500 
48656 
48836 
49030 
49183 
49608 
49934 



47836 
48244 
48383 
48500 
48741 
48923 
49114 
49481 
49746 



/18565 

1332 nex = 

55802 55320 
56319 56196 
56651 56408 

/21714 

1319 nex = 

54886 56204 

/24640 



len = 

60 



730 nex = 



3 






Init 
Intr 
Term 

>3510342

len = 

Sngl 

10 

>3510342 
len = 
Sngl

>3510343 
len = 

20 

Init 
Intr 
Intr 
Intr 

Intr
Intr 
Term 



30 



>3510343 



len = 
Sngl 

>3510343
len = 
Sngl 

40 

>3510343 
len = 

Term

Intr 
Intr 
Intr 
Intr 

Intr

Intr 
Intr 
Init 

>3510343
len = 



56678 56815 
56912 56980 
57073 57398 

/123863 

910 nex = 

60334 61240 

/534 

536 nex =

9075 8540 

/40742 

1891 nex = 

13415 13697 

13866 14012 

14098 14196 

14285 14476 

14612 14731 

14818 14889 

14970 15305 

/24171 

1330 nex = 

2256 3585 

72722 

503 nex = 

30214 29712 

/106836 

1702 nex = 

33894 33628 

34089 33998 

34258 34188 

34371 34329 

34518 34462 

34704 34605 

34927 34777 

35103 35021 

35329 35222 

/6903 

1763 nex = 



Init 37722 38574 
Intr 38655 38839






Intr 
Term 



Init 
Intr 

Term
>3510343 
len = 

15 

Term 
Intr 
Init 

>3510343
len = 
Sngl 

25 

>3510343 
len = 

Term

Intr 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Intr 
Intr 

Intr

Init 

>3510343 
len =

Sngl 
>3510343 

50 

len = 
Sngl 

>3510343
len = 



38929 39039 
39124 39484 

/347 

1651 nex = 

4180 4489 
5552 5687 
5778 5830 

/97314 

1179 nex = 

41393 41220 
41794 41558 
42398 42147 

/29713 

614 nex = 

43022 42748 

742932 

3299 nex = 

44028 43691 

44238 44168 

44426 44307 

44873 44679 

45135 45071 

45321 45231 

45556 45495 

45732 45682 

45918 45817 

46113 46006 

46273 46210 

46989 46641 

/27758 

1726 nex =

48931 50656 

/41873 

1120 nex = 

49483 50602 

/103302 

346 nex =



Sngl 50315 50660 

60 



0 






>3510343 /11988 

len = 2000 nex = 

Sngl 53425 52376 

>3510343 /40911 

len = 533 nex = 

Term 54279 53987 

Init 54519 54367 

>3510343 /4645 

len = 2130 nex = 

Init 68383 68789 

Intr 69020 69154 

Intr 69249 69327 

Intr 69422 69516 

Intr 69735 69902 

Term 69982 70512 

>3510343 /40013 

len = 219 nex = 

Sngl 77830 77612 

>3510343 /4267 

len = 790 nex = 

Sngl 82643 83428 

>3510344 /2679 

len = 511 nex = 

Sngl 43185 43695 

>3510345 738935 

len = 1870 nex = 

Term 20537 19806 

Intr 20684 20623 

Intr 20927 20816 

Intr 21079 21017 

Init 21667 21540 

>3510345 /16321 

len = 1839 nex = 

Init 30949 31052 

Intr 31463 31499 

Intr 31947 32020 

Intr 32124 32217






Intr 
Term 

>3510345 

Init 
Intr 
Intr 
Intr 
Term 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3510345 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3510345 

len = 

Term 
Intr 
Intr 
Init 

>3510346 

len = 

Sngl 

>3510346 



32330 32412 
32494 32787 



31463 31499 

31947 32020 

32124 32217 

32330 32412 

32494 32923 



/40590 



2650 

36093 
36588 
37250 
37733 
37917 
38223 
38470 



36290 
37160 
37316 
37824 
38126 
38360 
38726 



3010 nex 



61602 
62116 
62321 
62503 
62663 
62944 
63156 
63328 
63507 
63687 
63877 
64039 



61918 
62244 
62413 
62582 
62776 
63017 
63232 
63411 
63599 
63794 
63948 
64603 



3836 nex = 

82161 81264 

82858 82787 

83396 83347 

84608 84277 

/34019 

465 nex =

19046 18582 

/3859 



len =



583 nex =



1 






Sngl 
>3510346 

5 

len = 

Term 
Intr 

Intr
Init 

>3510346 

len =

Sngl 

>3510346 

len = 

Init 
Term 



20 



>3510346 
len = 



Init 
Term 



>3510347 

len =

Term 
Intr 
Intr 

Intr
Intr 
Init 

>3510347 

45 

len = 

Term 
Intr 

Intr
Intr 
Init 

>3510347 

55 

len = 
Sngl 



19162 18580 

/41105 

2655 nex = 

21838 21322 

22141 21929 

22425 22322 

23176 22513 

/18886 
1400 nex = 
2565 1166 

/6830 

1411 nex = 

28915 29605 
29936 30325 

/19036 

925 nex 

29413 29605 
29936 30323 

732287 

1572 nex = 

17536 17338 

17706 17628 

17947 17881 

18159 18060 

18386 18246 

18909 18664 

/10832 

1590 nex = 

17536 17344 

17706 17628 

17947 17881 

18159 18060 

18386 18246 

78265 

1462 nex = 

25560 25814 



>3510347



737190 






2683 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3510347 

len = 



26268 
27118 
27478 
27755 
27939 
28258 
28413 
28719 



26469 
27309 
27554 
27814 
28142 
28320 
28481 
28950 



1497 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Init 



32966 
33122 
33434 
33570 
33921 
34260 



32764 
33065 
33389 
33535 
33809 
34059 



>3510347 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3510347 
len = 
Sngl 

>3510347 

len = 

Init 
Intr 
Intr 
Term 



/6 

2248 

35112 
35312 
35520 
35687 
36105 
36365 
36591 
36889 



nex = 

34642 
35190 
35412 
35606 
35898 
36207 
36457 
36687 



/40282 
1030 nex = 
44145 43121 

/21002 

1112 nex = 

44576 44637 

44840 44941 

45017 45138 

45230 45687 



>3510347 
len = 



/8259 
911 nex =



Init 
Term 



46905 47248 
47327 47815 



>3510347 

60 



/101505 






len 



1259 nex = 



Init 
Intr 
Intr 
Intr 
Term 

>3510347 

len = 



49449 49599 

49835 50106 

50195 50285 

50364 50430 

50515 50707 

/2528 

1377 nex = 



Term 
Init 



>3510347 
len = 



57801 57178 
58554 58264 



/112017 
914 nex =



Term 
Init 



>3510347 
len = 



57801 57641 
58554 58264 



/21882 
2650 nex =



Init 
Intr 
Term 

>3510347 

len = 



60423 61009 
61588 61725 
62599 63070 

/93510 

1477 nex = 



Init 
Intr 
Term 



64300 64751 
64887 65024 
65353 65776 



>3510347 
len = 



/25205 
2153 nex 



Init 
Intr 
Intr 
Intr 
Term 

>3510347 

len = 

Term 
Init 



68370 68963 

69054 69194 

69282 69422 

69950 70056 

70149 70522 

/lO 

731 nex = 

6348 6208 

6938 6424 



>3510347 
len = 
Init 



/20808 
1651 nex = 
72240 72762 






Intr 
Intr 
Term 



72870 73010 
73158 73298 
73581 73890 



>3510347 
len = 



73268 
670 nex 



Term 
Init 



74338 73874 
74537 74427 



>3510347 
len = 



2512 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



8462 
9019 
9299 
10221 
10414 
10655 



8802 
9104 
9349 
10287 
10493 
10973 



>3513725 
len = 



/36053 
1750 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



13019 
13469 
13779 
13897 
14038 
14282 
14473 



13159 
13609 
13816 
13947 
14205 
14394 
14764 



>3513725 
len = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



25174 
25648 
25798 
25990 
27220 
27422 



25560 
25712 
25884 
26129 
27274 
27795 



>3513725 
len = 



2650 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



25215 25560 

25648 25712 

25798 25884 

25990 26129 

27220 27274 

27422 27859 

/15608 



len = 

60 



2626 nex = 



9 








Init 


28826 






















Intr 


29928 


30063 








Intr 


30157 


30258 


+ 


0 


5 


Intr 


30340 


30567 


+ 


0 




Intr 


30642 


30720 


+ 


0 




Intr 


30807 


30889 


+ 


0 






30975 


31066 




0 




Term 


31194 


31451 






10 














>3513725 


/11425 








len 


1812 


nex = 






15 


Init 


430QQ 


43341 


+ 


0 






43510 


43706 








Term 


44140 


44588 








>3513725 


/4588 






20 














len = 


1847 


nex = 


3 






Init 


42785 


43341 








Intr 


43510 


43706 






25 


Term 


44140 


44631 








>3513725 


/19449 








len = 


1469 


nex = 


3 




30 














Init 


48183 


48572 


+ 


0 




Intr 


48751 


48856 




^ 




Term 


49391 


49651 






35 


>3513725 
>3513725 


/33161 








len = 


1990 


nex = 


5 






Term 


70844 


70396 




0 


40 


Intr 


71145 


71026 


: 


0 




Intr 


71729 


71348 


- 


0 




Intr 


72167 


72131 




^ 




Init 


72376 


72258 






45 


>3513725 


72538 








len 


2710 


nex = 








Term 


88530 


88065 


_ 


0 


50 


Intr 


88919 


88628 


- 


0 




Intr 


89919 


89306 




0 




Init 


90771 


90027 




0 




>3522932 


/1233 






55 














len = 


468 


nex = 


1 






Sngl 


14882 


14415 




0 



>3522932



76247 






len = 

Sngl 

>3522932 

len = 

Sngl 

>3522932 

len = 

Sngl 

>3522932 

len = 

Sngl 

>3522932 

len = 

Init 
Term 

>3522932 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3548797 
len = 
Sngl 

>3548797 

len = 

Term 
Intr 
Intr 
Intr 
Init 



629 nex = 
14910 14282 
/34004 
670 nex = 
14939 14277 
/34114 
970 nex =
26932 27893 
/119203 
310 nex = 
31113 31414 
/4G043 
2092 nex = 



33181 
34475 



34387 
35272 



/17422 
1694 



85840 
85984 
86150 
86316 
86490 
86710 
86925 
87172 



nex = 

85605 
85923 
86064 
86236 
86395 
86576 
86803 
87000 



739349 

1390 nex =

45745 47127 

/118505 

1236 nex = 

54662 54394 

54808 54746 

54939 54898 

55250 55035 

55629 55345 

/37655 






len =



2848 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3582315 
len = 
Sngl 

>3582315 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3582315 
len = 
Sngl 

>3582315 

len = 

Init 
Term 

>3600029 

len = 

Sngl 

>3600029 

len = 

Sngl 

>3600045 



73586 73293 

73848 73771 

74084 73948 

74268 74199 

74454 74368 

74621 74571 

74888 74761 

75233 75148 

75468 75317 

75678 75563 

76140 75987 

/29796 
1556 nex = 
45020 43465 

/34214 

1941 nex = 

46259 46498 

46588 46734 

46826 46878 

46958 47120 

47207 47435 

47518 48199 

/12086 

651 nex = 

51096 50446 

/36220 

1908 nex = 

64549 65215 
66042 66456 

/26159 

390 nex = 

24328 23939 

/35159 

1179 nex = 

25177 23999 

/24130 



len =



1030 nex = 



3 






Term 
Intr 
Init 



15293 14783 
15574 15428 
15807 15656 



>3600045 
len = 
Sngl 

>3600045 
len = 



/32602 
1450 nex = 
25383 26825 
725988 
824 nex = 



Term 
Init 



3124 2675 
3498 3344 



>3600045 
len = 



/22850 
408 nex 



Term 
Init 



>3600045
len = 



41689 41456 
41863 41773 



735662 
1570 nex 



Term 
Intr 
Init 



51918 51401 
52566 51996 
52969 52885 



>3608126 
len = 



75901 
1296 ne 



Term 
Intr 
Init 



11196 10881 
11486 11310 
12176 11888 



>3608126 
len = 



73992 
1349 nex =



Term 
Intr 
Intr 
Intr 
Init 



26268 
26428 
26650 
27078 
27343 



26001 
26385 
26563 
27031 
27172 



>3608126 
len = 



79670 
1059 ne 



Term 
Init 



1939 1719 
2777 2477 



>3608126 

60 



717424 






2011 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



33042 
33307 
33518 
33658 
33803 
34105 
34389 
34808 



32798 
33131 
33400 
33592 
33742 
33911 
34199 
34562 



>3608126 
len = 



Init 
Intr 
Term 



34965 35667 
35839 36034 
36222 36633 



>3608126 

len = 

Init 
Intr 
Term 

>3608126 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



39040 39422 
39532 39576 
39671 39912 



3702 

4194 
5595 
6594 
7084 
7302 
7672 



4297 
5680 
6973 
7213 
7588 
7895 



>3608126 



len = 

Term 
Intr 
Intr 
Init 



814 

56920 
57180 
57415 
57604 



nex = 

56791 
57010 
57327 
57495 



>3608126 

len = 

Term 
Intr 
Intr 
Init 



/24194 

1115 nex = 

56920 56626 

57180 57010 

57415 57327 

57740 57495 



/23672 



len = 

60 



2710 nex = 



6 








Term 


779 


523 






Intr 


995 


869 






Intr 


1176 


1079 


: 




Intr 


1629 


1462 


- 




Intr 


2092 


1983 






Init 


2233 


2181 






>3617740


/14534 




10 


len = 


2050 


nex = 


5 




Init 


38982 


39082 


+ 




Intr 


39269 


39379 


+ 




Intr 


39472 


39776 


+ 


15 


Intr 


39865 


40274 






Term 


40379 


40796 






>3617740 


/17803 




20 


len = 


878 


nex = 


4 




Init 


38750 


38796 








38982 


39082 






Intr


39269 


39379 




25 


Term 


39472 


39614 






>3617740 


/125212 






len = 


613 


nex = 


2 


30 












Init 


40232 


40274 






Term 


40379 


40844 






>3617740 


/4609 




35 












len = 


3370 


nex = 


14 




Term 


47827 


47264 






Intr 


48084 


47941 




4 0 


Intr 


48277 


48170 






Intr 


48449 


48436 






Intr 


48556 


48515 






Intr 


48691 


48640 






Intr 


48810 


48769 


_ 


45 


Intr 


48971 


48922 


- 




Intr 


49144 


49080 






Intr 


49457 


49334 






Intr 


49729 


49648 






Intr 


49877 


49801 


- 


5 0 




50044 


49973 






Init 


50199 


50135 






>3641835 


/124015 




55 


len = 


586 


nex = 


2 




Term 


104751 


104680 






Init 


105265 


105103 





>3641835



724998 






len = 

Term 
Intr
Init 

>3641835 

len =

Term 
Intr 
Init 

15 

>3641835 

Sngl

>3641835 
len = 

25 

Term 
Intr 
Intr 
Intr 

Intr
Init 

>3641835 

len =

Term 
Intr 
Init 

40 

>3641835 
len = 

Term

Intr 
Init 

>3641835 

50 

len = 
Sngl 

>3641835
len = 
Sngl 

60 



1218 

1358 nex = 3 



104368 103951 - 0 

104751 104466 - 0 

105308 105103 - 0 

/40208 

1313 nex = 3 

106166 106021 - 0 

106634 106461 - 0 

107333 106949 - 0 

/10510 

692 nex = 1

118544 117853 - 0 

729665 

2207 nex = 6 

13437 13279 - 0 

13643 13530 - 0 

14090 13733 - 0 

14763 14339 - 0 

15097 15003 - 0 

15485 15170 - 0 

/101430 

600 nex = 3

19477 19405 - 0 

19845 19550 - 0 

20004 19937 - 0 

78586 

1191 nex = 3 

40504 40292 - 0 

41034 40900 - 0 

41253 41179 - 0 

740231 

1140 nex = 1

57915 59054 + 0 

76475 

460 nex = 1 

58753 59203 + 0 






>3641835 
len = 



Term 
Intr 
Init 



71259 70935 
71875 71830 
72127 71973 



Term 
Intr 

Init



71259 70897 
71875 71830 
72167 71973 



>3641835 
len 

2 0 

Sngl 
>3641835 
len =



73741 72909 



Sngl 
>3641835 
len = 



73759 72963 



3595 



nex 



Init 
Intr 

Intr
Intr 
Intr 
Intr 
Intr 

Intr
Intr 
Intr 
Intr 
Term 

45 

>3641835 



74327 
75055 
75428 
75665 
76009 
76267 
76474 
76796 
77032 
77218 
77476 
77612 



74696 
75097 
75503 
75734 
76158 
76371 
76620 
76871 
77120 
77309 
77518 
77921 



len =



2501 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



78502 
78669 
78845 
79022 
79225 
79399 
79650 
79835 
80095 
80300 
80465 



78584 
78711 
78934 
79097 
79294 
79548 
79754 
79981 
80170 
80388 
80556 






Intr 
Term 



80724 80766 
80863 80994 



>3641835 
len = 



3190 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



81717 
81922 
82305 
82488 
82779 
83016 
83421 
83600 
83792 
84448 



81263 
81848 
82153 
82426 
82579 
82915 
83270 
83531 
83679 
84248 



1363 nex 



Init 
Intr 
Intr 
Intr 
Term 

>3641835 

len = 



85011 85116 

85354 85491 

85562 85747 

85840 86040 

86137 86234 

/24125 

578 nex = 



Init 
Term 



>3641835 
len = 



85840 86040 
86137 86267 



/27838 
1964 nex 



Init 
Intr 
Intr 
Term 



92854 
93122 
93954 
94181 



93018 
93163 
94035 
94236 



>3641835 
len = 



/18715 
2001 nex =



Init 
Intr 
Intr 
Term 



92859 
93122 
93954 
94181 



93018 
93163 
94035 
94236 



>3641835 
len = 
Sngl 



/22619 
837 nex = 
97080 96244 



>3643588



/7152 






len = 

Sngl 

>3643588 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3643588 

len = 

Sngl 

>3643588 

len = 

Sngl 

>3643588 

len = 

Sngl 

>3643588 

len = 

Sngl 

>3650026 

len = 

Sngl 

>3650026 

len = 

Term 
Init 

>3650026 

len = 



700 nex = 
100213 100912 
726655 
2932 nex = 



23965 
24558 
24818 
25037 
25184 
25536 
25690 
26128 
26350 
26519 



24029 
24733 
24949 
25087 
25280 
25582 
25909 
26257 
26392 
26896 



739893 

1292 nex = 

34172 32881 

/33811 

619 nex = 

38584 37966 

/14229

88 nex = 

44849 44762 

/107487 

688 nex = 

91073 91760 

/105053 

1096 nex =

15351 15025 

/41621 

1450 nex = 

14294 13985 
15430 15025 

/98613 

103 nex = 






Sngl 
>3650026 
len = 

Sngl 
>3650026 

Sngl 

>3650025 

len = 

Term 
Intr 
Intr 
Intr 
Init 



20145 20247 

/1845 

1078 nex = 

25288 25744 

/105106 

430 nex = 

25301 25714 

/18430 

2276 nex = 

27554 27148 

27977 27733 

28182 28075 

28573 28309 

29423 29071 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3659491 

len = 

Init 
Intr 
Intr 
Term 

>3659491 

len = 

Sngl 



4510 

9461 
9645 
9862 
10492 
10829 
11104 
11365 
11551 
11745 
12001 
12392 
12653 
13533 



nex = 

9027 
9554 
9749 
10129 
10696 
10996 
11244 
11442 
11641 
11843 
12337 
12602 
13069 



1515 nex = 

17945 18140 

18495 18596 

18711 18833 

19154 19459 

/11713 

647 nex = 

29537 28891 



>3659491 

60 



/1822 






len 



1542 nex =



Term 
Intr 
Intr 
Init 

>3659491 

len = 

Term 
Init 

>3659491 

len = 

Init 
Intr 
Intr 
Term 

>3659491 

len = 

Init 
Intr 
Term 

>3659491 

len = 

Init 
Intr 
Term 

>3659491 

len =

Sngl 

>3659491 

len = 

Sngl 

>3659491 

len = 

Init 
Intr 
Term 



30282 29975 

30669 30547 

30870 30769 

31516 31375 

/16879 

792 nex = 

43340 43008 
43799 43512 

/31696 

1488 nex =

52934 53129 

53228 53290 

54065 54124 

54224 54421 

/3853 

1674 nex = 

67957 68461 
68545 68737 
69218 69630 

/22630 

1510 nex = 

67989 68461 
68545 68737 
69218 69495 

/119226 

1460 nex = 

80172 80692 

736272 

1398 nex =

80172 80692 

/31768

1795 nex = 

82901 82956 
83626 83993 
84259 84695 



>3659491



/1530 






Term 
Init 



86548 85965 
87077 86859 



>3659491 
len = 
Sngl 

>3668073 

len =

Init 
Intr 
Intr 
Term 

>3668073 
len = 
Sngl 

>3668073 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



/17503 
270 nex = 
9313 9044 

/16622 

1358 nex = 

23250 23580 

23767 23864 

23960 24025 

24310 24607 

737897 
1111 nex = 
51343 52453 

/37026 
1930 nex =



694 
1152 
1494 
1741 
1915 
2221 



789 
1397 
1661 
1834 
2085 
2618 



>3687221 
len = 



/16515 
1589 nex 



Init 
Intr 
Term 

>3687221 

len = 



102216 102407 
102664 102801 
103376 103791 

79466 

1096 nex = 



Term 
Intr 
Init 



107635 107345 
108241 108104 
108440 108355 



>3687221 
len = 



714718 
1498 nex






Sngl 121547 121349 



0 






len =



nex 



Init 
Intr
Intr 
Intr 
Term 

>3687221 

len = 

Term 
Intr 
Intr 
Init 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3688169 

len = 

Init 
Intr 
Intr 
Term 

>3688169 

len = 

Init 
Intr 
Intr 
Term 

>3688169 

len = 

Sngl 

>3688169 



91525
91730 
91913 
92096 
92263 



91626 
91799 
91991 
92177 
92497 



1290 nex = 

95134 94875 

95305 95237 

95573 95487 

96164 95953 

736835 

2308 nex = 



10054 
10172 
10479 
10683 
10809 
10942 
11425 
12081 



9794 
10133 
10407 
10607 
10773 
10882 
11225 
11998 



739326 

1424 nex = 

16580 16851 

16934 16974 

17077 17132 

17802 18003 

78739 

1450 nex = 

29897 30039 

30457 30564 

30834 31001 

31120 31345 

795066 

850 nex = 

6958 7799 

74868 



len = 




590 nex =



1 






Sngl 

>3695372 

len = 

Init
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3695372 

len = 

Sngl 

>3695386 

len = 

Sngl 

>3695400 

len = 

Sngl 

>3695400 

len = 

Term 
Init 

>3695400 

len = 

Init 
Term 

>3695400 

len = 

Sngl 

>3695400 



72372 71783 
725253 
2094 nex = 



1783 
2177 
2345 
2533 
2705 
2854 
3017 
3178 
3365 
3510 
3708 



2087 
2257 
2440 
2616 
2774 
2936 
3096 
3276 
3434 
3599 
3876 



/20974 

9399 nex = 

20229 19834 

724263 

711 nex = 

19462 20172 

/34605 

462 nex =

10570 11031 

/1489 

1587 nex = 

15637 15331 
16917 16616 

/10189 

985 nex =



25271 
25943 



25588 
26252 



/1831 
477 nex = 
26645 26169 
/35152 



len 




455 nex = 



1 






Sngl 
>3695400 
len = 



Init 
Intr 
Term 



3404 3506 
3589 3680 
3749 4032 



>3695400 
len = 



Term 
Init 



5925 5502 
7023 6790 



>3695400 



2144 



Term 
Intr 
Intr 
Intr 
Init 



77122 76779 

77397 77265 

77620 77515 

77799 77729 

78922 78765 



/32891 



2096 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



15705 
15919 
16274 
16586 
16845 
17042 
17227 
17401 
17583 



15815 
16138 
16489 
16719 
16916 
17131 
17313 
17480 
17792 



>3702315 
len = 



2082 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3702315 

len = 



18147 
18507 
18818 
19279 
19462 
19649 
19837 



18358 
18722 
18951 
19368 
19548 
19728 
20078 



/38881 
2006 nex 



Init 18051 18358 
Intr 18507 18722 
Intr 18818 18951



0 
0 
0
Intr
Intr 
Intr 
Term 



19279 19368 

19462 19548 

19649 19728 

19837 20056 



/126587 



1758 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



37610 
37865 
38002 
38260 
38382 
38543 
38770 
38917 
39096 



37788 
37924 
38107 
38302 
38454 
38630 
38826 
38995 
39367 



>3702315
len =
Init
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Term



1673 

37636 
37865 
38002 
38260 
38382 
38543 
38770 
38917 
39096 



nex = 

37788 
37924 
38107 
38302 
38454 
38630 
38826 
38995 
39308 



len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 
>3702315 

len = 



1311 

37685 
37865 
38002 
38260 
38382 
38543 
38770 
38917 



2328 



nex = 

37788 
37924 
38107 
38302 
38454 
38630 
38826 
38989 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



39649 
39862 
40012 
40165 
40534 
40776 
40939 
41622 



39295 
39780 
39960 
40081 
40435 
40626 
40857 
41363 



>3702315 




/16400
746 nex



Init 
Term 

>3702315 

len = 

Init 
Term 

>3702315 

len = 

Sngl 

>3702315 

len = 

Sngl 

>3702315 

len = 

Term 
Init 

>3702315 

len = 

Term 
Init 

>3702315 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3702722 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



4313 4574 
4732 5058 

722834 

711 nex = 

4314 4711 
4732 5024 

/2011 

815 nex = 

50261 49447 

/150696 

813 nex = 

805 1395 

79694 

1390 nex = 

63455 63085 
64470 64155 

72747 

1450 nex =

63455 63044 
64488 64155 

737367 

1800 nex =

75775 75964 

76051 76180 

76334 76471 

76560 76892 

76989 77172 

77268 77574 

72662 

2085 nex = 

14429 14277 

14686 14527 

14858 14767 

15138 14935 

15415 15221 

15543 15495 

15749 15646 






Intr 15877 15838 
Init 16361 16286 



len = 

Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2372 nex =



14429 
14686 
14858 
15138 
15415 
15543 
15749 
15877 
16437 



14066 
14527 
14767 
14935 
15221 
15495 
15646 
15838 
16286 



>3702722 
len =



1597 nex



Term
Intr
Intr
Intr
Init



50617 50334 

50852 50691 

51106 51000 

51288 51207 

51930 51764 



>3702722 
len =



735228 



2920 



nex = 



Term
Intr
Intr
Intr
Intr
Init

>3702722 



62691 62388 

63047 62898 

63282 63083 

63460 63392 

63685 63542 

64731 64543 

/150222 



Term 
Init 



885 nex



67157 66736 
67620 67353 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2568 

33692 
33810 
34153 
34894 
35061 
35224 
35373 
35512 
35642 
35787 
35966 



nex = 

33399 
33769 
34055 
34809 
34977 
35141 
35294 
35449 
35574 
35707 
35863

>3702723

len = 

Init 
Term 

>3702724 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3702724 

len = 

Sngl 

>3702724 

len = 

Sngl 

>3702724 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3702724 

len = 

Init 
Intr 
Intr 
Term 

>3702724 

len = 

Term 
Intr 
Init 



/12930 

673 nex = 

9767 9878 
10190 10439 

/20287 

1517 nex = 

20008 20168 

20256 20311 

20582 20705 

20799 20905 

21133 21196 

21314 21524 

/94685 

259 nex = 

38461 38203 

/33455 

1271 nex = 

39470 38200 

/6342 

2050 nex = 

47979 48280 

48382 48502 

48605 48696 

49127 49223 

49436 50023 

/16773 

1249 nex = 

64138 64368 

64447 64502 

64842 64905 

65169 65386 

/95513 

1636 nex = 

74797 74565 
75178 74949 
75971 75545 



>3702724 




76984
Init
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3702728 

len = 



3134 

8892 
9352 
9555 
9920 
10237 
10691 
11151 
11334 



9148 
9461 
9825 
10055 
10603 
11059 
11229 
11533 



/207075 
1647 nex 



>3702728 

len = 

Sngl 

>3702728 

len = 

Sngl 

>3702728 

len = 

Init 
Term 

>3702729 

len = 

Sngl 

>3702730 

len = 

Sngl 

>3702731 

len = 

Init 



/5200 

750 nex = 

25422 24673 

/29189 

578 nex =

34532 35109 

/25501 

939 nex = 

39949 40151 
40345 40887 

/7579 

1270 nex = 

28996 29292 

/18256 

614 nex = 

24396 23783 

/37901 

1750 nex = 

15709 16151 



Term 


10490 


10127 


0 


Intr 


10656 


10573 


0 


Intr 


10792 


10748 


0 


Intr 


10942 


10888 


0 


Intr 


11147 


11028 


0 


Intr 


11275 


11224 


0 


Intr 


11456 


11374 


0 


Init 


11773 


11604 


0 






Term 
>3702731 
len = 
Sngl 
>3702731 
len = 
Sngl 
>3702731 
len = 



17205 17451 



15712 15992 



Term 
Init 



22481 22203 
22709 22575 



>3702731 
len = 



2209 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



4259 
4544 
4802 
4997 
5223 
5410 
5596 
6189 



3981 
4368 
4628 
4887 
5148 
5337 
5493 
6014 



>3702731 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3702731 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



2086 nex = 

7741 8031 

8293 8355 

8442 8558 

8650 9091 

9265 9416 

9532 9826 

/27099 

2791 nex = 

81625 81712 

82523 82571 

82720 82828 

83013 83106 

83540 83673 

83776 83939 

84172 84415 



798232 



len =



2066 nex = 



2
Init
Term 

>3702732 

len = 

Init 
Term 

>3702732 

len = 

Init 
Term 

>3702732 
len = 
Sngl 

>3702732 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3702732 
len = 
Sngl 

>3702733 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



13407 13675 
14812 15472 

/19319 

1177 nex = 

18687 18864 
19260 19863 

/7700 

1035 nex = 

30814 31011 
31131 31848 

724349 

763 nex = 

59677 60439 

/23018 

2089 nex = 



60763 
60927 
61084 
61257 
61475 
61799 
61920 
62226 



60548 
60880 
61010 
61192 
61386 
61590 
61887 
62013 



2383 

11798 
12085 
12405 
12606 
12850 
13106 
13283 
13530 
13738 



nex = 

12008 
12315 
12484 
12773 
13025 
13175 
13437 
13613 
14180 



len =



1653 nex =
Init
Intr 
Intr 
Term 

>3702733 
len = 
Sngl 

>3702734 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3702734 
len = 
Sngl 

>3702734 

len = 

Init 
Intr 
Intr 
Term 

>3702734 
len = 
Sngl 

>3702734 

len = 

Init 
Term 

>3702734 

len = 



24950 25031 

25138 25259 

25830 25907 

25996 26115 

/789 
1531 nex = 
4266 2736 

/19116 
2455 nex = 



9390 
9751 
9915 
10035 
10280 
10491 
10697 

mil 

11472 



9018 
9471 
9844 
10000 
10119 
10387 
10611 
10989 
11255 



/7232 
146 nex = 
3094 3239 

/121024 

1737 nex = 

42608 42742 

43086 43196 

43298 43408 

43484 43846 

/10658 

579 nex =

4938 5516 

/14354 

1181 nex = 



662 
1645 



1137 
1842 



739557 
790 nex 



Sngl 7453 8241 




0
len =



2126 



nex 



Term
Intr
Intr
Intr
Intr
Intr
Init

>3702735

len =

Term
Intr
Intr
Intr
Intr
Intr
Init

>3702735
len =
Term
Intr
Intr
Intr
Init

>3702735
len =
Init
Intr
Intr
Intr
Intr
Intr
Term



9857 9648 

10069 10001 

10631 10500 

10849 10734 

11016 10935 

11194 11124 

11773 11547 



/29871 



2630 



nex = 



9857 9489 

10069 10001 

10631 10500 

10849 10734 

11016 10935 

11194 11124 

12118 11803 

/30034 

1860 nex = 

12882 12590 

13611 13478 

13786 13700 

14287 13958 

14449 14377 

/1492 

2013 nex = 

15808 15880 

16103 16150 

16235 16282 

16379 16450 

16538 16693 

16788 16891 

16967 17259 



/39954 



2145 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



17743 
17875 
18119 
18361 
18509 
18923 
19546 



17402 
17828 
17961 
18220 
18444 
18605 
19195 



>3702735 




/7451
len =


288 


nex = 


1 






Sngl 


19551 


19264 




0 




>3702735 


/19104 








len = 


2153 


nex = 


7 






Term 


17743 


17402 




0 




Intr 


17875 


17828 


- 


0 




Intr 


18119 


17961 




0 






18361 


18220 




0 




Intr 


18509 


18444 




0 




Intr 


18923 


18605 




0 




Init 


19554 


19195 




0 




>3702735 


737772 








len = 


2185 


nex = 


7 


















Term 


17743 


17415 




0 






17875 


17828 




0 




Intr 


18119 


17961 




0 




Intr 


18361 


18220 


- 


0 




Intr 


18509 


18444 




0 






18923 


18605 




0 




Init 


19600 


19195 




0 






/108736 




















len 


2183 


nex = 


7 








17743 


17418 




0 




Intr 


17875 


17828 


- 


0 




Intr 


18119 


17961 




0 




Intr 


18361 


18220 


- 


0 




Intr 


18509 


18444 




0 




Intr 


18923 


18605 


- 


0 




Init 


19600 


19195 




0 
















>3702735 


73012 








len 


970 


nex = 


1 






Sngl 


21165 


20204 




0 




>3702735 


716025 








len = 


1114 


nex = 


1 


















Sngl


28361 


29474 




0 




>3702735 


740827 








len = 


1870 


nex = 


7 






Init 


32059 


32271 


+ 


0 




Intr 


32368 


32648 


+ 


0 




Intr 


32727 


32792 


+ 


0 




Intr 


32884 


33024 


+ 


0
Intr
Intr 
Term 

>3702735 

len = 

Init 
Intr 
Term 

>3702735 

len = 

Init 
Intr 
Intr 
Term 

>3702735 
len = 
Sngl 

>3702735 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3702735 

len = 

Term 
Intr 
Intr 
Init 

>3702736 
len = 
Sngl 

>3702736 
len = 



33122 33250 
33351 33494 
33582 33926 

738864 

2110 nex = 

50260 50547 
50645 50722 
50817 51231 

71037 

1950 nex = 

52134 52251 

52369 52437 

52728 52834 

52940 53499 

/22308 

408 nex = 

53268 53675 

713359 

1750 nex =



61031 
61431 
61649 
61882 
62033 
62165 
62370 



61277 
61563 
61765 
61947 
62071 
62287 
62773 



726013 
1510 nex 



65133 
65309 
65703 
66162 



64653 
65230 
65393 
65839 



721033 
685 nex =
14295 13611 
722417 
2244 nex =



Init 22337 22446 
Intr 22746 22848






Intr
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



22935 
23099 
23309 
23513 
23673 
23978 
24299 



23021 
23226 
23405 
23590 
23898 
24201 
24580 



76639 
1465 nex =






Init 
Term 



>3702736 



len = 
Sngl
>3702736 



2970 3755 
3857 4434 

/150178 

495 nex =

41469 41963 

799738 






len = 

Sngl 

>3702737 

len =

Term
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Init

>3702737
len =
Term
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr



551 nex = 
48180 48730 
735235 



2470 

23292 
23486 
23671 
23913 
24083 
24325 
24509 
24649 
24929 
25118 
25399 
25529 



2968 

23292 
23486 
23671 
23913 
24083 
24325 
24509 
24649 
24929 
25118 
25399 
25622 



nex = 

23060 
23403 
23567 
23788 
24012 
24250 
24468 
24594 
24816 
25065 
25274 
25486 



nex = 

23058 
23403 
23567 
23788 
24012 
24250 
24468 
24594 
24816 
25065 
25274 
25486
Init

>3702737 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3702737 

len = 

Term 
Init 

>3702738 

len = 

Init 
Intr 
Term 

>3702739 

len = 

Sngl 

>3702739 

len = 

Sngl 

>3738088 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3738275 

len = 



26025 25713 
/41468 



1214 

61712 
61844 
62113 
62277 
62489 



nex = 

61761 
62014 
62182 
62397 
62925 



725523 

1270 nex =

9043 8460 
9725 9132 

73619 

1734 nex = 

8288 8505 
9426 9614 
9689 10021 

738027 

342 nex = 

342 1 

711852 

1734 nex = 

83965 83716 

729306 

2416 



21390 
21586 
21773 
21894 
22056 
22222 
22628 
23079 
23466 



nex = 

21051 
21492 
21684 
21859 
21973 
22133 
22314 
22864 
23181 



714182 
975 nex =



Init 18779 18953 
Term 19387 19753 

1241

>3738275 /8386 





len = 


1164 


nex = 


4 








22568 


22293 




0 




Intr 


22774 


22653 




0 




Intr 


23106 


23012 




0 




Init 


23456 


23209 




0 




>3738275


739367 








len = 


1990 




8 






Term 


38894 


38400 


- 


0 




Intr 


39077 


38993 




0 




Intr 


39243 


39152 


- 


0 




Intr 


39444 


39334 




0 




Intr 


39608 


39534 


- 


0 




Intr 


39848 


39753 


- 


0 






40152 


39939 




0 




Init 


40384 


40234 




0 




>3738275 


/21006 








len 


619 


nex = 








Sngl 


72476 


73094 


+ 


0 




>3738275 


/39915 




















len = 


1487 


nex = 


4 








75590 


76151 




0 




Intr 


76230 


76391 


+ 


0 




Intr 


76475 


76746 


+ 


0 




Term 


76819 


77076 


+ 


0 




>3738275 


/10341 








len 


929 


nex = 








Init 


82166 


82372 


+ 


0 




Term 


82451 


83094 


+ 


0 




>3738275 


/5414 








len = 


699 


nex = 


1 






Sngl 


85117 


84419 


- 


0 
















>3738275 


/31153 








len = 


587 


nex = 


1 






Sngl 


91661 


92247 


+ 


0 




>3738275 


77887 








len = 


1700 


nex = 


2 





Init
Term 



96923 97953 
98161 98622 



>3738313 
len = 
Sngl 

>3738313 
len = 



/11227 
1048 nex = 
31224 30177 
/14133 



2123 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



40926 
41351 
41614 
42263 
42517 
42788 



41117 
41508 
41805 
42416 
42709 
43048 



976 



nex 



Init 
Intr 
Intr 
Term 



45374 45479 

45584 45606 

45853 45974 

46056 46332 



>3738313 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3738313 

len = 



/13580 



1060 nex =



47214 
47473 
47779 
47946 
48097 



47374 
47666 
47863 
48011 
48273 



/28044 
3250 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3738313 

len = 



50661 
51114 
51543 
51756 
51950 
52089 
52236 
52471 
52634 
52848 
53400 



50155 
50735 
51199 
51628 
51850 
52030 
52165 
52328 
52563 
52716 
52959 



/11327 
2590 nex =






Term 



65078 64374 



0
Intr 65397

Intr 65607 

Intr 66003 

Intr 66278 

Init 66956 



65161 
65496 
65690 
66114 
66369 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3746057 
len = 
Sngl 

>3746057 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>3746057 

len = 

Term 
Init 

>3757512 

len = 

Term 
Intr 
Init 



2170 nex = 

92485 92831 

92913 92952 

93039 93098 

93463 93515 

93603 93652 

93744 93777 

93875 93947 

94031 94123 

94234 94650 

721261 

1285 nex = 

15904 14620 

736533 

3692 nex = 

27102 27418 

27863 28034 

29868 30043 

30142 30300 

30381 30793 

/10986 

574 nex =

32722 32483 

33056 32809 

734622 

922 nex = 

11574 11216 

11801 11665 

12137 11918 



>3757512 
len = 



Term 
Init 



7156725 
790 nex =



13186 13046 
13828 13761 



>3757512 




736208
Term


11574 


11127 


0 


Intr 


11801 


11665 


0 


Intr 


12158 


11918 


0 


Intr 


12416 


12250 


0 


Init 


13186 


13044 


0 



>3757512 

len = 

Init 
Intr 
Intr 
Term 

>3757512 

len = 

Init 
Intr 
Term 

>3757512 

len = 

Init 
Intr 
Term 

>3757512 

len = 

Init 
Intr 
Term 

>3757512 

len = 

Term 
Intr 
Init 



/17067

1177 nex = 

48329 48418 

48697 48796 

48877 49001 

49084 49505 

/1629 

1161 nex = 

48697 48796 
48877 49001 
49084 49491 

734897 

1056 nex = 

48697 48796 
48877 49001 
49084 49445 

/103391 

807 nex = 

48697 48796 
48877 49001 
49084 49503 

/10206 

730 nex = 

50124 49809 
50347 50233 
50535 50445 



>3757512 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



4768 nex = 

50124 49849 

50347 50233 

50533 50445 

51384 51295 

51786 51712 

51965 51882 

52405 52174



Intr 


52663 


52608 


Intr 


52821 


52765 


Intr 


52984 


52903 


Intr 


54261 


53779 


Init 


54612 


54457 



>3763915 739279 

len = 1270 nex = 


Term 19272 18719 

Init 19986 19634 

>3763915 /20820 


len = 2055 nex = 

Init 45580 45662 

Intr 45753 46015 

Intr 46153 46221

Intr 46358 46416 

Term 46586 46687 

>3763915 /12017 


len = 2775 nex = 

Init 50003 50101 

Intr 50315 50367 

Intr 50890 50957

Intr 51035 51253 

Intr 51335 51863 

Intr 52052 52294 

Term 52385 52777 


>3763915 734599 





len = 


1286 


nex = 




Init 


56203 


56557 




Intr 


56631 


56904 




Intr 


56983 


57115 




Intr 


57204 


57290 




Term 


57378 


57488 












>3763915 


7311 




len = 


4030 


nex = 




Term 


60393 


60334 




Intr 


60545 


60500 




Intr 


60724 


60645 




Intr 


60942 


60809 




Intr 


61216 


61109 




Intr 


61408 


61339 




Intr 


61923 


61846 




Intr 


62172 


62050 




Intr 


62408 


62304 




Intr 


63134 


63057 




Intr 


63482 


63426 






Intr 
Init 

>3763915 

len =

Init
Term

>3763915

len =

Term
Intr
Init

>3763944

len =

Init
Intr
Intr
Term

>3763944

len =

Init
Intr
Intr
Intr
Term

>3763944

len =

Init
Intr
Intr
Intr
Intr
Term

>3763944

len =

Init
Intr
Intr
Intr
Intr
Term



63977 63861 

64363 64190 

/31507 

1402 nex = 

7990 8335 

9245 9391 

/18447 

1714 nex = 

87259 87159 

87414 87348 

87961 87899 

78742 

1230 nex = 

15958 17118 

17220 17330 

17589 17767 

17874 18187 

736974 

1990 nex = 

1865 2006 

2096 2209 

2304 2531 

2952 3391 

3484 3851 

78686 

1578 nex = 

19154 19555 

19635 19679 

19765 19837 

19939 20036 

20136 20206 

20399 20731 

719199 

1228 nex = 

19315 19555 

19635 19679 

19765 19837 

19939 20036 

20136 20206 

20399 20542 



>3763944



79640 






Init 
Term 



3064 3391 
3484 3835 



>3763944 
len = 



2252 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



33666 33350 

34075 33766 

34206 34161 

34967 34896 

35185 35059 

35397 35293 

35601 35504 



>3763944 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>3763944 
len = 
Sngl 

>3763944 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3763944 

len = 

Term 
Intr 
Init 



/3321 



1486 nex 



48162 
48339 
48651 
48901 
49077 
49575 



48090 
48287 
48603 
48837 
48990 
49178 



/35300 
339 nex =
54201 53863 
/37313 
2087 nex =



65398 
65729 
66198 
66459 
66803 
67215 



55640 
65831 
66283 
66697 
66992 
67484 



/16451 

1295 nex = 

78903 78431 
79217 79086 
79720 79611 



>3763944 
len = 



Term 78903 78426 
Intr 79217 79086 






79807 79611 



len = 
Sngl 
>3766106 
len = 
Sngl 
>3766106 



/43034 
386 nex =
79807 79611 
/122624 
190 nex = 
13182 12996 
/17815 



4706 nex



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



13428 
13611 
13786 
13982 
14288 
14458 
14840 
15050 
15272 
15413 
15600 
15756 
15905 
16194 
16361 
16560 
16723 
16859 
17096 



12997 
13523 
13708 
13885 
14184 
14372 
14646 
14944 
15140 
15351 
15499 
15685 
15849 
16093 
16324 
16449 
16667 
16804 
16970 



>3766106 
len = 
Sngl 

>3766106 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



/31830 
1468 nex =
21312 20870 
72763 
1609 nex = 



25723 
25940 
26117 
26241 
26573 
26960 
27166 



25696 
25843 
26044 
26200 
26515 
26874 
27008 



len = 




1931 nex = 








Term 


25723 


25503 


- 


0 




Intr 


25940 


25843 




0 




Intr 


26117 


26044 


- 


0 




Intr 


26241 


26200 








Intr 


26573 


26515 


- 


0 




Intr 


26960 


26874 




0 




Intr 


27166 


27139 




0 




Init 


27433 


27265 


-


0 




>3766106 


/772 








len = 


1100 


nex = 


3 






Init 


55679 


55811 




0 




Intr 


55892 


56190 


+ 


0 




Term 


56287 


56543 


+ 


0 




>3766106


/42210 








len = 


1404 


nex = 


4 






Term 


4966 


4948 




0 




Intr 


5280 


5077 


-


0 




Intr 


5648 


5451 


-


0 




Init 


6351 


6198 


-


0 




>3766106 


737467 








len = 


3610 


nex = 


15 


















Init 


6698 


6918 


+ 


0 




Intr 


7000 


7028 


+ 


0 




Intr 


7114 


7151 


+ 


0 




Intr 


7313 


7400 


+ 


0 




Intr 


7670 


7775 


+ 


0 




Intr 


7890 


7960 


+ 


0 




Intr 


8194 


8263 


+ 


0 




Intr 


8365 


8427 


+ 


0 




Intr 


8718 


8789 


+ 


0 




Intr 


8898 


8999 


+ 


0 




Intr 


9219 


9293 




0 




Intr 


9415 


9513 


+ 


0 




Intr 


9747 


9863 


+ 


0 






9947 


10003 




0 




Term 


10101 


10306 


+ 


0 




>3766106


/31765 








len = 


1407 


nex = 


5 


















Term 


82104 


82012 








Intr 


82327 


82239 




0 




Intr 


82516 


82454 




0 




Intr 


82689 


82600 




0 




Init 


82977 


82917 




0 




>3766106 


/32117 








len = 


2329 




9 








Term 


81812 


81701 




0 


Intr 


82104 


82012 




0 


Intr 


82327 


82239 




0 


Intr 


82516 


82454 




0 


Intr 


82689 


82600 




0 


Intr 


82977 


82917 




0 


Intr 


83164 


83070 




0 


Intr 


83370 


83281 




0 


Init 


84029 


83633 




0 



Term 
Intr 
Init 

>3785968 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3785968 

len = 

Term 
Intr 
Intr 
Init 

>3785968 
len = 
Sngl 

>3785968 

len = 

Init 
Term 

>3785968 

len = 



43560 43284 
43921 43879 
44211 44021 



2151 

47820 
48026 
48227 
48637 
48785 
49026 
49271 
49514 
49702 



nex = 

47930 
48147 
48287 
48705 
48903 
49173 
49415 
49600 
49962 



/18820 

2037 nex = 

51353 50716 

51853 51786 

52406 52062 

52752 52479 

711949 

1210 nex = 

56312 55108 

735997 

1090 nex = 

62489 62738 
63173 63578 

/17603 

1210 nex = 



Init 62489 62738 
Term 63173 63691






len 


956 


nex = 


2 




Init 


832 


994 




0 


Term 


1320 


1787 


+ 


0 


>3785968 


/32861 






len = 


1930 


nex = 


5 




Init 


97394 


97474 




0 




97567 


97743 




0 


Intr 


97843 


97921 




0 




98036 


98151 




0 


Term 


98536 


99144 


+


0 


>3785992 


73294 






len = 


1600 


nex = 


6 




Term 


14853 


14529 


- 


0 


Intr 


15039 


14947 






Intr 


15343 


15131 






Intr 


15667 


15434 






Intr 


15888 


15743 






Init 


16128 


15979 






>3785992 


/32721 






len = 


2131


nex = 


6 




Init 


18649 


18757 


+ 


0 


Intr 


18865 


19046 








19570 


19788 






Intr 


19870 


20091 




0 


Intr 


20183 


20254 


+ 


0 


Term 


20343 


20779 


+ 


0 


>3785992 


/40283 






len = 


1785 


nex = 


4 




Init 


21029 


21179 






Intr 


21550 


21664 








21810 


21961 






Term 


22069 


22813 


+ 


0 


>3785992 


/17861 






len = 


2802 


nex = 


5 




Init 


30317 


30479 


+ 


0 


Intr 


30906 


31025 


+ 


0 


Intr 


31867 


31981 


+ 


0 


Intr 


32237 


32330 


+ 


0 


Term 


32412 


32768 


+ 


0 



>3785992



735493 








len 


3170 


nex 








Init 


33260 


33564 


+ 


0 




Intr


33721 


33919 


+ 


0 






34406 


34469 




0 




Intr 


34703 


34847 




0 




Intr 


35156 


35242 


+ 


0 




Intr 


35398 


35555 


+ 


0 




Intr 


35655 


35703 








Intr 


35816 


35940 




0 




Term 


36021 


36429 




0 






737377 




















len =


2858 


nex = 


10 






Init 


33355 


33564 








Intr 


33721 


33919 


+ 


0 




Intr 


34406 


34469 


+ 


0 




Intr 


34703 


34847 










35156 


35242 








Intr 


35398 


35555 


+ 


0 




Intr 


35655 


35703 


+ 


0 




Intr 


35816 


35940 








Intr 


36021 


36123 








Term 


36144 


36212 








>3785992 


/21746 




















len = 


2965 


nex = 








Term 


45531 


45288 


-


0 




Intr 


45743 


45645 




0 




Intr 


45891 


45826 


-


0 




Intr 


46097 


45977 


- 


0 




Intr 


46418 


46351 




0 




Intr 


46701 


46630 




0 




Intr 


47414 


47230 


-


0 




Intr 


47749 


47514 


- 


0 




Init 


48252 


47839 








>3785992 


/101298 








len = 


250 


nex = 








Sngl 


82983 


82739 








>3785992 


725626 




















len = 


2398 


nex = 


6 






Term 


83028 


82729 




0 




Intr 


83330 


83293 




0 




Intr 


83469 


83420 




0 




Intr 


83997 


83830 




0 




Intr 


84473 


84083 




0 




Init 


85126 


84720 




0 



>3785992



/18328 






len = 


730 nex = 


2 




Term 


88201 87996 


- 


0 


Init 


88716 88313 




0 


>3789706 


/18060 






len = 


1375 nex =


2 




Init 


17339 17476 




0 


Term 


18116 18387 


+ 


0 


>3789706 


/40461






len = 


1943 nex = 


7 




Init 


41058 41294 


+ 


0 


Intr 


41427 41497 


+ 


0 


Intr 


41746 41854 


+ 


0 


Intr 


41986 42068 


+ 


0 


Intr 


42216 42374 


+ 


0 


Intr 


42466 42567 


+ 


0 


Term 


42665 43000 


+ 


0 


>3789706 


/12454 






len = 


1482 nex = 


1 




Sngl 


49272 47791 


-


0 


>3789706 


/36143 






len = 


1053 nex = 


1 




Sngl 


66764 65712 


-


0 


>3789706


3 






len =


695 nex
nex 














len 


2297 nex 






Term 


16983 16533 




0 


Intr 


17156 17087 


-


0 


Intr 


17682 17566 




0 




17899 17800 




0 


Intr 


18015 17992 




0 


Intr 


18268 18128 




0 


Intr 


18613 18591 




0 


Init 


18829 18688 




0 


>3805839 


/119300 






len = 


1606 nex = 


5 





Init 44348 44802 
Intr 44996 45173








Intr 


45268 


45351 


+ 


0 




Intr 


45442 


45609 


+ 


0 






45706 


45953 


+ 


0 




>3805839 


75482 








len = 


3370 


nex = 


12 






Init 


47368 


47506 


+ 


0 






47679 


47841 


+ 


0 




Intr 


48042 


48107 


+ 


0 




Intr 


48209 


48294 


+ 


0 




Intr 


48479 


48596 


+ 


0 




Intr 


48680 


48886 


+ 


0 




Intr 


49008 


49084 


+ 


0 




Intr 


49250 


49343 


+ 


0 






49606 


49671 




0 




Intr 


49774 


49851 


+ 


0 




Intr 


49955 


50056 


+ 


0 




Term 


50145 


50455 




0 




>3805839 


/17725 








len = 


2056 


nex = 


6 


















Term 


50772 


50483 


- 


0 




Intr 


50949 


50881 




0 




Intr 


51247 


51043 




0 




Intr 


51692 


51610 


- 


0 




Intr 


52241 


52102 








Init


52538 


52356 




0 




>3805839


/1095 








len = 


1127 


nex = 


3 






Term 


70923 


70590 




0 






71436 


71382 




0 




Init 


71716 


71529 




0 
















>3805839 


78360 








len 


1095 


nex = 


3 






Term 


94509 


94155 


- 


0 






94708 


94595 




0 




Init 


95249 


95131 




0 




>3831437 


73743 




















len = 


941 


nex = 


3 






Init 


6216 


6447 


+ 


0 




Intr 


6552 


6618 


+ 


0 




Term 


6704 


7156 


+ 


0 




>3831448 


79671 








len = 


1169 


nex = 


2 








Init 
Term 

>3831448 

len = 

Sngl 

>3831448 

len = 

Sngl 

>3831448 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3849811 

len = 

Term 
Intr 
Intr 
Init 

>3849811 

len = 

Term 
Intr 
Intr 
Init 

>3849811 

len = 

Term 
Intr 
Init 

>3849811 

len = 



40447 40831 
41264 41615 

/17467 

395 nex = 

61543 61937 

/35104 

1522 nex = 

8526 7008 

/37237 

3 011 nex = 



88224 
88432 
88649 
89505 
89768 
89984 
90722 



87712 
88334 
88483 
89463 
89673 
89868 
90415 



79623 

1165 nex = 

896 533 

1195 997 

1462 1307 

1697 1546 

/1568 

104 2 nex = 

23646 23559 

23842 23733 

24392 24313 

24600 24516 

/30470 

1123 nex = 

23842 23486 
24392 24313 
24608 24516 

77182 

1193 nex = 



Term 23646 23450 
Intr 23842 23733 
Intr 24392 24313






Init 

>3849811 

len = 

Sngl 

>3849811 

len = 

Init 
Intr 
Term 

>3849811 

Init 
Term 

>3849811 
len =
Sngl 

>3849811 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>3849811 

len = 

Init 
Term 

>3849811 
len = 
Sngl 

>3859590 
len = 



24642 24516 

/36468 

1339 nex = 

28830 27492 

/33391 

1311 nex = 

29563 29631 
29985 30486 
30573 30873 

/35307 

730 nex = 

29985 30486 
30573 30713 

724946 

772 nex = 

37425 38196 

/19020 

2018 nex = 

51039 51216 

51447 51550 

51656 51804 

51969 52041 

52302 52341 

52685 52767 

/18200 

1310 nex = 

59367 59612 
60232 60674 

/12305 

512 nex = 

5941 6452 

734747 

741 nex = 



Term 39398 39023 
Intr 39602 39499 
Init 39763 39682






>3859590 
len = 

5 

Init 
Term 

>3859590 

10 

len = 
Sngl 

15 >3859590 
len = 
Term 

2 0 Intr 
Intr 
Intr 
Intr 
Intr 

2 5 Intr 

Intr 
Intr 
Intr 
Init 

30 

>3859590 
len = 

3 5 Term 

Intr 
Intr 
Init 

40 >3859590 
len = 
Term 

4 5 Intr 

Init 

>3859590 
50 len = 

Sngl 
>3859658 

55 

len = 
Sngl 



/36898 

14 7 6 nex = 

53829 54125 
54213 54342 

/15259 

343 nex = 

58875 59217 

/20761 

2 429 nex = 

74605 74285 

74729 74694 

74938 74825 

75236 75018 

75451 75326 

75613 75539 

75707 75703 

75970 75814 

76141 76061 

76278 76225 

76713 76474 

/95621 

1150 nex = 

77280 77093 

77795 77518 

77994 77893 

78208 78069 

/15383 

1038 nex = 

84801 84590 
85252 84874 
85627 85486 

/29072 

712 nex = 

88825 89536 

/17194 

515 nex = 

16172 16686 



>3859658



/107993 






len = 

Sngl 

>3859658 

len = 

Sngl 

>3859658 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>3859658 

len = 

Init 
Intr 
Term 

>3859658 

len = 

Term 
Init 

>3859658 

len = 

Term 
Init 

>3859658 
len = 
Sngl 

>3859658 
len = 



58 6 nex = 

16174 16759 

/33603 

409 nex = 

16552 16144 

/6589 

2209 nex = 

20101 20257 

20354 20412 

20723 20875 

20969 21017 

21116 21241 

21326 21379 

21582 21664 

21751 21856 

21928 22293 

/29040 

74 4 nex = 

21625 21664 
21751 21856 
21928 22368 

/31672 

712 nex = 

40705 40329 
41020 40903 

/109432 

610 nex = 

48314 48122 
48723 48437 

/38416 

4 03 nex = 

49905 50307 

729886 

2037 nex = 



Term 50573 50297 
Intr 50791 50663 
Intr 51573 51274






Intr 
Init 



51886 51658 
52333 52049 



>3859658 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3859658 
len = 
Sngl 

>3859658 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3859658 

len = 

Term 
Intr 
Init 

>3860242 

len = 

Init 
Term 

>3860242 

len = 



2978 nex = 



52912 
53209 
53605 
53757 
53947 
54222 
54623 
54886 
55081 
55456 



52479 
53003 
53285 
53707 
53843 
54160 
54507 
54819 
55005 
55320 



/5743 
790 nex = 
63524 64304 
76636 
3730 nex = 



85246 
85487 
85826 
85995 
86282 
86558 
87259 
87578 
88293 



84985 
85360 
85560 
85914 
86121 
86366 
87159 
87376 
88190 



13 9 8 nex = 

91234 91166 
91430 91335 
92183 91722 

/1860 

833 nex = 

19578 19983 
20109 20390 

/123227 

13 90 nex = 



Init 22933 23272 
Intr 23512 23607 
Intr 23806 23845






Term 24006 24320

>3860242 /29893
len = 1317 nex =
Init 38909 39035
Intr 39156 39214
Intr 39796 39839
Term 39982 40225

>3860242 /40339
len = 705 nex =
Sngl 45860 45156

>3860242 /38650
len = 1396 nex =
Sngl 46542 45147

>3860242 /25214
len = 2937 nex =
Term 50568 50049
Intr 50808 50653
Intr 51048 50902
Init 52985 52151

>3860242 /21711
len = 2316 nex =
Init 62073 62233
Intr 62317 62379
Intr 62477 62543
Intr 62938 62988
Intr 63081 63161
Intr 63509 63621
Intr 63713 63830
Intr 63931 64057
Term 64162 64388

>3860242 /96159
len = 1407 nex =
Term 64781 64588
Intr 64990 64893
Intr 65621 65504
Init 65994 65896

>3860243 /14203
len = 949 nex =

>3860243 /19048






len = 1570 nex =
Init 115309 115528
Intr 115614 115764
Intr 115886 116091
Term 116181 116399

>3860243 /28205
len = 2483 nex =
Init 12779 13109
Intr 13268 13360
Intr 13462 13569
Intr 13849 13963
Intr 14039 14172
Intr 14261 14362
Intr 14472 14564
Intr 14654 14789
Term 14881 15261

>3860243 /6892
len = 2155 nex =
Init 12829 13109
Intr 13268 13360
Intr 13462 13569
Intr 13849 13963
Intr 14039 14172
Intr 14261 14362
Intr 14472 14564
Term 14654 14983

>3860243 /113089
len = 1489 nex =
Init 13495 13963
Intr 14039 14172
Intr 14261 14362
Intr 14472 14564
Term 14654 14983

>3860243 /23972



len = 1889 nex = 



Term 


16019 


15795 


Intr 


16256 


16115 


Intr 


16406 


16348 


Intr 


16535 


16500 


Intr 


16972 


16620 


Intr 


17135 


17065 


Init 


17683 


17580 



>3860243 /33700



len =



910 nex = 






Sngl 

>3860243 

len = 

Sngl 

>3860243 

len = 

Init 
Term 

>3860243 

len = 

Init 
Intr 
Term 

>3860243 

len = 

Term 
Intr 
Intr 
Init 

>3860243 

len = 

Term 
Init 

>3860243 

len = 

Term 
Intr 
Init 

>3860243 

len = 

Term 
Intr 
Init 



19112 18209 

737739 

1518 nex = 

22909 21392 

/112146 

1001 nex = 

36410 36592 
36690 36790 

73359 

2034 nex = 

58793 58908 
59274 59312 
59418 59832 

719986 

1037 nex = 

60333 60150 

60655 60423 

60984 60894 

61186 61074 

7207594 

4 07 nex = 

62033 61867 
62273 62139 

79845 

1062 nex = 

62033 61843 
62371 62139 
62751 62661 

7120911 

959 nex = 

62033 61946 
62371 62139 
62751 62661 



len = 



1039 nex = 






Intr 62371 62139 
Init 62751 62661 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



2090 

73691 
74216 
74385 
74585 
74744 
74933 
75094 
75307 
75511 



74012 
74300 
74501 
74649 
74832 
75012 
75205 
75426 
75780 



Term 
Intr 

2 5 Intr 
Intr 
Intr 
Intr 
Intr 

30 Init 
>3868722 
len = 

35 

Sngl 

>3868723 

40 len = 

Init 
Intr 
Intr 

45 Intr 
Intr 
Term 



>3868723 



50 



len = 
Sngl 

55 >3868723 
len = 



28 91 nex = 

96072 95749 

96363 96208 

96564 96466 

96729 96655 

96877 96809 

97159 96961 

97879 97774 

98639 97974 

/125396 

17 9 nex = 

14027 13849 

/35059 

33 73 nex = 

47360 47692 

47785 47840 

47939 47993 

49588 49668 

49756 49979 

50524 50732 

/101605 

415 nex = 

52231 51821 

/42560 

1118 nex = 



Term 62440 61893 
Init 63010 62468






>3869062 
len = 

5 

Init 
Term 

>3869063 

10 

len = 

Init 
Term 

15 

>3869063 
len = 
20 Sngl 
>3869063 
len = 

25 

Init 
Intr 
Intr 
Intr 

3 0 Intr 

Intr 
Term 

>3869064 

35 

len = 

Term 
Intr 

4 0 Intr 

Intr 
Intr 
Intr 
Intr 

45 Init 
>3869065 
len = 

50 

Term 
Intr 
Init 

55 >3869065 
len = 
Term 

60 Intr 



/15995 

1300 nex = 

5593 5774 
5873 6151 

/12997 

1210 nex = 

39503 39620 
39709 40708 

/2393 

813 nex = 

49706 50518 

/37862 

2178 nex = 

65062 65329 

65646 65723 

65820 65987 

66119 66167 

66275 66348 

66564 66657 

66756 67239 

737349 

2599 nex = 

1590 525 

1966 1776 

2274 2069 

2446 2367 

2588 2535 

2723 2663 

2912 2839 

3118 2994 

/119129 

1317 nex = 

30028 29747 
30375 30116 
31063 30948 

/17872 

1112 nex = 

36192 35871 
36608 36279 






Init 36982 36828

>3869065 /28066
len = 202 nex =
Sngl 58498 58699

>3869065 /29368
len = 2470 nex =
Init 65929 66029
Intr 66279 66586
Intr 66669 66764
Intr 66861 66938
Intr 67036 67212
Intr 67884 67952
Term 68051 68397

>3869066 /40790
len = 2119 nex =
Init 35935 36190
Intr 36417 36492
Intr 36599 36672
Intr 36770 36975
Intr 37095 37410
Intr 37502 37655
Term 37756 38053

>3869066 /118207
len = 685 nex =
Sngl 47112 46433

>3869067 /33343
len = 2470 nex =
Init 22709 22777
Intr 22866 22916
Intr 23160 23271
Intr 23380 23424
Intr 23921 24046
Intr 24373 24479
Intr 24569 24640
Term 24751 25178

>3869067 /20436
len = 2137 nex =
Init 25230 25389
Intr 25682 25841
Intr 25964 26519
Intr 26612 26883
Term 27050 27366






>3869067 
len = 



Init 29168 29584 + 0
Intr 29688 30014 + 0
Intr 30180 30332 + 0
Intr 30470 30594 + 0
Intr 30707 30802 + 0
Term 30894 31285 + 0



>3869067 

15 len = 

Term 
Intr 
Intr 

2 0 Intr 
Init 

>3869067 

25 len = 

Init 
Intr 
Term 



1210 nex = 

31596 31244 

31767 31669 

31964 31870 

32299 32185 

32449 32382 

/12033 



32650 32895 
32975 33207 
33308 33974 



>3869067 

len = 

Term 
Intr 
Intr 
Init 



1018 nex = 

34260 34052 

34424 34339 

34606 34490 

35069 34689 



40 >3869067 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



/40511 



2721 

37516 
37720 
37991 
38371 
38556 
38753 
39082 
39226 
39654 
39813 
40028 



nex = 

37614 
37836 
38076 
38459 
38611 
38833 
39147 
39318 
39718 
39948 
40236 



6 0 Sngl 



4200 3268 



0 






>3869067 
len = 

5 

Term 
Intr 
Init 

10 >3869067 
len = 



/123228 

999 nex = 

48536 48214 
48756 48626 
49212 48805 



15 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 

2 0 Intr 
Intr 
Intr 
Intr 
Intr 

25 Init 
>3869067 
len = 

30 

Init 
Term 



35 



>3869068 



2567 

53194 
53329 
53480 
53617 
53823 
54011 
54167 
54334 
54762 
54939 
55148 
55468 



len = 
Sngl 

40 >3869068 
len = 
Sngl 
>3869068
len = 



45 



50 



nex = 

52902 
53276 
53412 
53549 
53701 
53910 
54093 
54244 
54425 
54853 
55058 
55243 



Term 
Intr 
Intr 
Intr 
Init 

>3869068 

len = 



/17375 

576 nex = 

69414 69652 
69732 69989 

/17883 

535 nex = 

16765 16231 

/30700 

310 nex = 

18011 17704 

/29310 

2516 nex = 

18580 17714 

19092 18662 

19286 19183 

19439 19371 

20229 20088 

/112999 

658 nex = 



Sngl 27286 27943



0 






len = 


556 




1 




Sngl 


82911 


82356 


- 


0 




/106959 






len =


404 




1 




Sngl


2091 


1688 




0 


>3869069 


/12707 






len = 


3200 




7 




Init 


30363 


30606 


+ 


0 


Intr 


31683 


31882 


+ 


0 


Intr 


32143 


32357 


+ 


0 


Intr 


32433 


32523 




0 


Intr 


32605 


32908 




0 


Intr 


33006 


33066 




0 


Term 


33143 


33562 




0 


>3869069


/101843 






len = 


2862 


nex = 


10 




Init 


33963 


34200 


+ 


0 


Intr 


34320 


34353 




0 


Intr 


34515 


34655 


+ 


0 


Intr 


34751 


34838 


+ 


0


Intr 


34915 


34995 


+ 


0 




35079 


35143 




0 


Intr


35303 


35443 




0 


Intr 


35530 


35643 


+ 


0 


Intr 


35847 


35877 


+ 


0 


Term 


36333 


36824 


+ 


0 


>3869069 


/15577 






len 


1904 


nex = 






Term 


2612 


1677 


- 


0 


Intr 


3118 


2808 


-


0 


Init 


3580 


3284 




0 


>3869069 


736656 






len = 


1892 


nex = 


3 




Term 


2612 


1692 




0 


Intr 


3118 


2808 




0 


Init 


3583 


3284 




0 


>3869069 


/206508 






len = 


1588 




2 










Init 


39936 


40013 


+ 






40104 


40647 


+ 




>3869069 


/20484 




5 












len = 


1176 


nex = 


5 




Term 


52941 


52689 


-




Intr 


53185 


53027 


- 


10 


Intr 


53373 


53284 






Intr 


53608 


53463 






Init 


53864 


53704 


-




>3869069 


/1490 




15 












len = 


1592 


nex = 


5 




Term 


54883 


54300 






Intr 


55072 


54965 


- 


20 


Intr 


55295 


55169 








55587 


55538 






Init 


55891 


55681 






>3869069 


/5507 




25 












len = 


2151 


nex = 


9 




Init 


7968 


8130 


+ 




Intr 


8391 


8430 


+ 


30 


Intr 


8506 


8598 


+ 




Intr 


8713 


8761 


+ 




Intr 


8841 


8947 


+ 




Intr 


9024 


9111 


+ 




Intr 


9197 


9400 




35 


Intr 


9480 


9533 


+ 




Term 


9618 


10118 


+ 




>3869069 


/33741 




40 


len = 


431 


nex = 


1 




Sngl 


9688 


10112 


+ 




>3869070 


/12673 




45 












len = 


610 


nex = 


1 




Sngl


2159 


1550 




50 


>3869071


/486 






len = 


2937 


nex = 


9 




Term 


11670 


11336 




55 


Intr 


12043 


11759 






Intr 


12404 


12123 






Intr 


12784 


12477 






Intr 


13308 


12859 






Intr 


13539 


13389 




60 


Intr 


13791 


13627 








Intr 
Init 

>3869071 

Term 
Intr 
Intr 
Intr 
Init 

>3869071 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>3869071 

len = 

Sngl 

>3869071 

len = 

Sngl 

>3869071 

len 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3869072 

len = 

Init 
Intr 
Term 

>3869072 

len = 



13989 13873 
14272 14165 



1270 nex = 

16190 16024 

16555 16484 

16728 16660 

16951 16821 

17290 17177 

/9813 

1278 nex = 

16190 16019 

16395 16323 

16555 16484 

16728 16660 

16951 16821 

17296 17177 

/7192 

839 nex = 

30021 29183 

722955 

730 nex = 

38598 37870 

/38211 

2531 nex = 

57057 56592 

57299 57147 

57562 57440 

57858 57674 

58188 58099 

58480 58304 

58813 58761 

/94805 

862 nex = 

18597 18828 
18912 19014 
19090 19458 

/17342 

956 nex = 






Term 
Init 



>3869072 
len = 



Term 
Init 



>3869072 
len = 



31251 30971 
31926 31676 



/158846 
977 nex =



31251 30955 
31911 31676 



/117248 
1065 nex =



Term 
Init 

>3869072 

len = 

Term 
Init 

>3869072 

len = 

Term 
Init 

>3869072 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>3869072 

len = 

Sngl 

>3869072 

len = 

Sngl 

>3869072 

len = 



31251 30869 
31933 31676 



/8695 
1011 nex 



31251 
31933 



30923 
31676 



/20422 
821 nex 



31251 31113 
31933 31676 



/7107 
2532 nex 



32844 
33363 
33767 
34032 
34209 
34421 
34651 



32601 
33255 
33623 
33877 
34148 
34295 
34556 



/18208 
861 nex = 
44310 44409 
/26609 
190 nex = 
53681 53499 
/39314 
1182 nex = 



60 Term 



53837 53492 



0 






Intr 
Init 



54380 
54673 



54210 
54468 



len = 537 nex = 

Sngl 10605 11141 
>3869073 /36279



>3869074 
len = 
Sngl 

>3869074 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



18587 19044 
/102795 
632 nex = 
10415 9784 
/7236 
1613 nex = 



40937 
41262 
41618 
41836 
41988 
42160 
42351 



41184 
41385 
41726 
41895 
42053 
42251 
42549 



len = 

Term 
Intr 

4 0 Intr 
Intr 
Intr 
Intr 
Intr 

4 5 Intr 

Intr 
Intr 
Init 

50 >3869074 
len = 
Term 

Intr

Intr 
Init 



2576 

2325 
2504 
2685 
2944 
3117 
3305 
3550 
3788 
3973 
4165 
4344 



1773 
2410 
2589 
2776 
3057 
3204 
3439 
3634 
3903 
4054 
4288 



1171 nex 



42980 
43303 
43699 
43851 



42681 
43190 
43568 
43806 



>3869074 

60 



/16926 






Term 
Intr 
5 Intr 
Init 

>3869074 

10 len = 

Term 
Intr 
Intr 

15 Intr 
Intr 
Intr 
Intr 
Intr 

20 Intr 
Intr 
Intr 
Intr 
Init 

25 

>3869074 
len = 
3 0 Sngl 
>3869074 
len = 

35 

Sngl 
>3869074 
40 len = 

Sngl 
>3869074 

45 

len = 
Sngl 

50 >3869074 
len = 
Sngl 

55 

>3869074 



1210 nex = 

42980 42679 

43303 43190 

43699 43568 

43876 43806 

/8940 

36 54 nex = 

2325 1767 

2604 2410 

2685 2589 

2944 2776 

3117 3057 

3305 3204 

3550 3439 

3788 3634 

3973 3903 

4165 4054 

4428 4288 

4653 4519 

5420 4898 

/149198 

2 78 nex = 

58874 59151 

/40315 

414 nex = 

58874 59287 

/154419 

43 3 nex = 

58874 59306 

729762 

854 nex = 

58874 59727 

79669 

7 90 nex = 

58941 59724 

711244 

430 nex = 



Sngl 59119 59548






>3869074 /2815
len = 671 nex =
Sngl 62233 62903

>3869074 /113158
len = 405 nex =
Sngl 73937 73533

>3869074 /10401
len = 491 nex =
Sngl 74021 73531

>3869074 /24707
len = 730 nex =
Term 74030 73533
Init 74252 74212

>3869074 /206301
len = 692 nex =
Term 74030 73784
Init 74475 74212

>3869074 /39446
len = 1883 nex =
Term 74030 73781
Intr 74502 74212
Init 75663 74775

>3869074 /28168
len = 1097 nex =
Term 8429 7843
Init 8939 8715

>3869075 /97900
len = 937 nex =
Init 10363 10398
Term 10768 11299

>3869075 /13188
len = 1908 nex =
Term 12350 12052






Intr 12644 12437
Intr 13278 12727
Intr 13478 13381
Init 13959 13755

>3869075 /31648
len = 1810 nex =
Init 18786 18937
Intr 19201 19479
Term 20326 20591

>3869075 /20274
len = 1795 nex =
Init 29646 29969
Term 30976 31440

>3869075 /9242
len = 938 nex =
Init 31907 32136
Term 32507 32844

>3869075 /25655
len = 896 nex =
Init 34505 34737
Term 35146 35400

>3869075 /26281
len = 894 nex =
Init 34509 34737
Term 35146 35402

>3869075 /3657
len = 490 nex =
Sngl 36909 37394

>3869075 /30416
len = 2410 nex =



Term 


62622 


62304 


Intr 


62778 


62704 


Intr 


63019 


62876 


Intr 


63302 


63104 


Intr 


63790 


63564 


Intr 


63930 


63882 


Init 


64704 


64524 



>3869075 /1728






Init 

5 Intr 
Intr 
Intr 
Term 

10 >3869075 
len = 
Init 

15 Intr 
Intr 
Term 



1844 nex = 

71156 71325 

71732 71877 

71985 72203 

72292 72504 

72590 72999 

/42640 

1307 nex = 

71693 71877 

71985 72203 

72292 72504 

72590 72999 



733687 



Init 
Term 

25 

>3873174 
len = 

3 0 Term 

Intr 
Intr 
Intr 
Init 

35 

>3873174 
len = 

4 0 Term 

Intr 
Init 



Term 
Intr 

50 Intr 
Intr 
Init 



>3873174 
len = 
Sngl 



1549 nex = 

7633 8264 
8616 8707 

/148790 

1397 nex = 

27243 26811 

27470 27381 

27601 27558 

27797 27701 

28207 27989 

/101691 

67 0 nex = 

27601 27558 
27797 27701 
28227 27989 

738697 

14 7 2 nex = 

27243 26768 

27470 27381 

27601 27558 

27797 27701 

28239 27989 

727203 

776 nex = 

35931 36706 



>3873174 /23916






Sngl 

5 

>3873174 
len = 
10 Sngl 
>3873174 
len = 

15 

Sngl 
>3873174 
2 0 len = 

Sngl 
>3873174 

25 

len = 

Term 
Intr 

30 Intr 
Intr 
Init 

>3873174 

35 

len = 
Sngl 

40 >3873174 
len = 
Term 

45 Init 
>3873174 
len = 

50 

Term 
Intr 
Intr 
Intr 

55 Intr 
Intr 
Intr 
Intr 
Init 

60 



635 nex = 

3852 3218 

/106667 

572 nex = 

3852 3281 

77438 

1037 nex = 

45314 46350 

72558 

1620 nex = 

6341 7960 

740058 

2300 nex = 

62162 61975 

62557 62385 

63236 63043 

63711 63642 

64274 64013 

7125844 
74 7 nex = 
7214 7960 

737036 

1550 nex = 

74393 73816 
75365 74610 

737030 

2803 nex = 

97302 97150 

97507 97385 

97791 97597 

98007 97879 

98258 98096 

98487 98352 

98916 98575 

99091 99011 

99952 99437 






Term 19497 19031 
Intr 19671 19581
Init 19813 19745 



2950 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



19497 
19671 
19813 
20265 
20516 
20681 
21008 
21560 
21743 
21973 



19026 
19581 
19745 
20078 
20382 
20607 
20768 
21482 
21651 
21878 



len 



2611 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



28061 
28235 
28405 
28839 
29013 
29188 
29519 
29749 
30098 
30451 



27841 
28145 
28340 
28793 
28929 
29102 
29446 
29666 
30059 
30230 



Init 
Term 



31479 31619 
32162 32660 



>3885325 
len = 



/29321 
936 nex 



Init 31484 31619

Term 32162 32419 



>3885325 /23675
len = 1196 nex =



Init 31488 31619 
Term 32162 32683 



>3885325



/7008 






len = 
Init 

5 Term 
>3885325 

len = 

10 

Term 
Intr 
Init 

15 >3885325 
len = 
Init 

2 0 Intr 
Term 

>3885325 

2 5 len = 

Term 
Init 

30 >3892698 
len = 
Sngl 

35 

>3892698 

len = 

4 0 Term 
Init 

>3892698 

4 5 len = 

Sngl 

>3892698 

50 

len = 

Term 
Init 

55 

>3892698 
len = 

6 0 Term 



9 96 nex = 

31541 31619 
32162 32536 

735542 

4002 nex = 

76796 76680 
77285 77136 
77644 77555 

/8450 

1248 nex = 

94767 95217 
95576 95626 
95755 96014 

/30446 

942 nex = 

98533 97856 
98797 98628 

/5111 

852 nex = 

49208 48945 

735338 

217 0 nex = 

47914 47683 
49208 49126 

736849 

772 nex = 

51041 50270 

718749 

1672 nex = 

56961 55850 
57521 57295 

711009 

3236 nex = 

54492 54297 






Intr 


54670 


54612 


Intr 


54925 


54776 


Intr 


55085 


54993 


Intr 


56961 


56851 


Init 


57532 


57295 



>3892698 /5348
len = 535 nex =
Sngl 74771 74237

>3894156 /11351
len = 2097 nex =
Init 37638 37696
Intr 37801 38037
Intr 38134 38351
Term 38433 38887

>3894156 /18246
len = 2050 nex =
Init 44721 44808
Intr 44904 44960
Intr 45125 45231
Intr 45327 45398
Intr 45557 45610
Intr 45697 45772
Intr 45919 45980
Intr 46137 46196
Term 46455 46618

>3894156 /107424
len = 2332 nex =
Init 44688 44808
Intr 44904 44960
Intr 45125 45231
Intr 45327 45398
Intr 45557 45510
Intr 45697 45772
Intr 45919 45980
Intr 46137 46196
Term 46455 47019

>3894156 /114042
len = 1556 nex =
Term 47511 46634
Init 48189 47592

>3894156 /38635
len = 2276 nex =

60 






Init 
Intr 
Intr 
Intr 
Term 

>3894156 

len = 

Init 
Intr 
Intr 
Intr 
Term 



68187 68540 

68715 68747 

68929 69157 

69743 69869 

69944 70462 

/41104 

1762 nex = 

68444 68540 

68715 68747 

68929 69157 

69743 69869 

69944 70205 

/36320 





len = 


158 








2 0 














Sngl 


19935 


19778 








>3894179 


/13608 






25 


len = 


1419 




3 






Init 


3877 


4048 


+ 


0 




Intr 


4194 


4398 


+ 


0 




Term 


5037 


5295 


+ 


0 


30 














>3894179 


/11854 








len = 


2214 


nex = 


6 




35 


Term 


38528 


38322 




0 




Intr 


38784 


38647 




0 




Intr 


39404 


39281 




0 




Intr 


39652 


39529 




0 




Intr 


39874 


39742 




0 


40 


Init 


40535 


40056 




0 




>3894179 


/125677 








len = 


676 


nex = 


2 




45 














Init 


6794 


6915 




0 




Term 


7018 


7469 


+ 


0 



>3894179 
len = 
Sngl 

>3894179 
len = 



79236 
575 nex = 
76198 76772 
/15522 
527 nex = 



Sngl 76584 76058 

60 



0 






>3894179 /31116
len = 459 nex =
Sngl 80390 80848

>3894179 /30800
len = 409 nex =
Sngl 83631 83223

>3894179 /37949
len = 1100 nex =
Term 83667 83218
Intr 83835 83752
Intr 83992 83914
Init 84302 84098

>3894179 /11537
len = 953 nex =
Sngl 8992 9944

>3927822 /32542
len = 500 nex =
Sngl 1179 680

>3927822 /2058
len = 927 nex =
Sngl 26615 25689

>3927822 /35476
len = 589 nex =
Term 28078 27741
Init 28329 28179

>3927822 /31457
len = 2687 nex =
Term 28078 27729
Intr 28328 28179
Intr 28740 28637
Intr 28940 28833
Intr 29186 29082
Intr 29361 29288
Intr 29506 29446
Intr 29658 29584
Intr 29879 29769
Init 30415 29968






>3927822 
len = 

5 

Init 
Intr 
Intr 
Intr 

10 Intr 
Intr 
Intr 
Term 

15 >3927822 



2351 nex = 

49567 49905 

49994 50053 

50131 50202 

50276 50397 

50479 50619 

50710 50817 

50893 51145 

51224 51814 

722382 



6 60 nex = 



Sngl 66099 65440 





len = 


1250 


nex = 






25 


Term 


85753 


85484 


- 


0 




Intr 


85900 


85833 




0 




Intr 


86011 


85978 


- 


0 




Intr 


86139 


86108 




0 




Intr 


86442 


86391 




0 


30 


Init 


86733 


86530 




0 




>3927822 


7948 








len = 


1606 


nex = 


1 




35 














Sngl 


88569 


86964 




0 




>3928074 


77805 






40 


len = 


2278 


nex = 


5 






Init 


10584 


10991 


+ 


0 




Intr 


11688 


11817 


+ 


0 




Intr 


12087 


12157 


+ 


0 


45 


Intr 


12246 


12326 


+ 


0 




Term 


12682 


12861 


+ 


0 




>3928074 


717830 






50 


len = 


819 


nex = 


2 






Init 


15343 


15421 


+ 


0 




Term 


15807 


16161 


+ 


0 


55 


>3928074 


738129 








len = 


1666 


nex = 


4 





Term 5915 5673 
Intr 6499 6134






Intr 
Init 

>3928074 

5 

len = 

Init 
Intr 

10 Term 
>3928074 
len = 

15 

Term 
Intr 
Intr 
Intr 

20 Init 
>3928074 
len = 

25 

Term 
Init 

>3980374 

30 

len = 

Term 
Init 

35 

>3980374 

len = 

4 0 Init 
Intr 
Term 

>3980374 

45 

len = 

Term 
Intr 

50 Init 
>3980374 
len = 

55 

Init 
Intr 
Intr 
Intr 

60 Term 



7048 6578 
7338 7135 

/39558 

2141 nex = 

75510 76133 
76482 76604 
76976 77204 

/20810 

1514 nex = 

79125 78844 

79336 79217 

79585 79475 

79927 79717 

80038 80002 

/11662 

89 8 nex = 

8846 8488 
9385 9139 

734873 

7 38 nex = 

116215 115949 
116686 116328 

/17240 

194 8 nex = 

13701 14199 
14385 14654 
14795 15648 

/13720 

1219 nex = 

16100 15850 
16298 16222 
17068 16911 

75112 

1528 nex = 

17481 17533 

17782 17834 

18312 18350 

18477 18546 

18662 18881 






>3980374 
len 

5 

Init 
Intr 
Intr 
Intr 

1 0 Term 
>3980374 
len = 

15 

Sngl 

>3980374 

20 len = 

Init 
Term 

25 >3980374 
len = 
Init 

3 0 Term 
>3980374 
len = 



Init 
Term 



Init 
Intr 

4 5 Term 
>3980374 
len = 

50 

Init 
Term 



725736 

1510 nex = 

17406 17533 

17782 17834 

18312 18350 

18477 18546 

18662 18914 

725828 

691 nex = 

25933 26623 

738727 

950 nex = 

27265 27698 
27792 28214 

76528 

1035 nex = 

34104 34489 
34608 35138 

724361 

89 8 nex = 

40839 41193 
41271 41736 

739038 

981 nex = 

4299 4497 
4750 4890 
5212 5279 

76157 

741 nex = 

45228 45692 
45776 45968 

723439 

2314 nex = 



Init 46866 46968 
Intr 47598 47735 
Intr 47926 48071






Intr 48238 48357
Intr 48453 48665
Intr 48761 48841
Term 48926 49179

>3980374 /33426
len = 1249 nex =
Init 47926 48071
Intr 48238 48357
Intr 48453 48665
Term 48761 48842

>3980374 /2618
len = 1394 nex =
Term 54808 54506
Intr 55007 54901
Intr 55303 55087
Intr 55595 55388
Init 55899 55798

>3980374 /14555
len = 1516 nex =
Term 57704 57309
Intr 57909 57803
Intr 58204 57988
Intr 58508 58301
Init 58824 58614

>3980374 /1637
len = 1285 nex =
Term 66552 66306
Intr 66749 66643
Intr 67079 66860
Intr 67374 67167
Init 67590 67476

>3980374 /342
len = 1796 nex =
Init 8035 8458
Intr 8947 9099
Intr 9183 9280
Term 9375 9830

>3980374 /28554
len = 1450 nex =
Init 8140 8458
Intr 8947 9099
Intr 9183 9280






>3980374 
5 len = 

Sngl 
>3983533 

10 

len = 

Init 
Term 

15 

>3985931 

len = 

2 0 Term 
Intr 
Intr 
Init 

25 >3985931 
len = 
Sngl 

30 

>3985931 
len = 

3 5 Term 

Intr 
Intr 
Intr 
Intr 

4 0 Intr 

Intr 
Intr 
Init 

45 >3985931 
len = 
Sngl 

50 

>3985931 
len = 
Sngl
>3985931 
len = 

60 



9375 9586 

/21655 

310 nex = 

9418 9723 

/40760 

811 nex = 

55705 55965 
56057 56515 

/11375 

1235 nex = 

27019 26645 

27179 27114 

27323 27258 

27879 27598 

/32856 

1429 nex = 

40537 39109 

74623 

2005 nex = 

43824 43509 

43989 43913 

44159 44065 

44295 44251 

44544 44399 

44741 44654 

44873 44843 

45006 44981 

45513 45090 

/109141 

67 0 nex = 

73609 72960 

/19506 

69 2 nex = 

9471 10162 

/119485 

6 29 nex = 



1287 

+ 0 



+ 0 



2 

+ 0 
+ 0 



0 
0 
0 
0 



1 

0 



9 

0 
0 
0 
0 
0 
0 
0 
0 
0 



1 

0 



1 






Sngl 9473 10101 

>3985931 /4831 

len = 565 nex = 

Sngl 9591 10155 

>3985933 /39005 

len = 4007 nex = 



Init 28319 28431 + 0
Intr 28805 28887 + 0
Intr 28996 29125 + 0
Intr 29224 29303 + 0
Intr 29443 29529 + 0
Intr 29623 29747 + 0
Intr 29839 29899 + 0
Intr 30132 30209 + 0
Intr 30301 30406 + 0
Intr 30490 30635 + 0
Term 30833 31065 + 0



25 >3985934 
len = 
Sngl 

30 

>3985934 

len = 

35 Init 
Intr 
Intr 
Intr 
Intr 

4 0 Intr 
Intr 
Intr 
Term 

45 >3985934 
len = 



/26805 
677 nex = 
19613 18937 
725275 
2482 nex = 



37180 
37890 
38268 
38434 
38687 
38925 
39196 
39362 
39502 



37552 
38045 
38336 
38604 
38830 
39083 
39261 
39421 
39661 



Init 


38434 


38604 


+ 


Intr 


38687 


38830 


+ 


Intr 


38925 


39083 


+ 


Intr 


39196 


39261 


+ 


Intr 


39362 


39421 


+ 


Term 


39502 


39558 


+ 



>3985934 
len = 



Init 3984 4155



0 






Intr 
Intr 
Term 



>3985934 

20 

len = 



4879 
6176 
6691 



5952 
6354 
7351 



1289 

+ 0 

+ 0 

+ 0 

2 

+ 0 

+ 0 

2 

+ 0 

+ 0 



>3985934 

len = 



Init 
Term 



>3985934 
len = 



Init 
Term 



/31667 
1184 nex 



40094 
40861 



40516 
41277 



/118260 
926 nex 



40294 40516 
40861 41219 



/13962 
568 nex 



Sngl 42221 41654

>3985934 /32925
len = 447 nex = 1
Sngl 43700 43254

>3985934 /14816
len = 1140 nex = 4
Init 48454 48662 +
Intr 48749 48979 +
Intr 49063 49263 +
Term 49374 49593 +

>3985949 /158734
len = 624 nex = 1
Sngl 10055 10678

>3985949 /15880
len = 1150 nex = 2
Term 19933 19551
Init 20691 20400

>3985949 /17909
len = 3262 nex = 8
Init 51683 52348
Intr 52522 53058
Intr 53154 53240
Intr 53329 53399






Intr 53470 53546
Intr 53652 53799
Intr 53897 54058
Term 54365 54944

>3985949 /26535
len = 1492 nex =
Init 62051 62540
Intr 62742 62879
Intr 62957 63043
Intr 63118 63185
Term 63285 63542

>3985950 /102285
len = 436 nex =
Sngl 22343 22778

>3985952 /9393
len = 3254 nex =
Init 12815 13266
Intr 13492 13653
Intr 13771 13957
Intr 14238 14663
Intr 15037 15444
Term 15529 16068

>3985952 /11984
len = 2739 nex =
Init 20557 20824
Intr 20898 20956
Intr 21064 21139
Intr 21217 21288
Intr 21367 21454
Intr 21528 21780
Intr 21897 22012
Intr 22111 22331
Intr 22422 22453
Intr 22546 22648
Intr 22763 22943
Term 23026 23295

>3985952 /207193
len = 204 nex =
Sngl 23109 23312

>3985952 /3151
len = 537 nex =
Term 40208 40095






Init 40631 40491 

>3985952 /113501 

5 len = 910 nex = 

Term 40208 39985 

Init 40888 40491 

10 >3985952 738966 

len = 1279 nex = 

Term 40208 40095 

15 Init 41373 40491 

>3985952 /35731 

len = 2127 nex = 

Term 41847 41632 

Intr 41987 41928 

mtr 42146 42067 

Intr 42309 42227 

Intr 42525 42409 

Intr 42673 42596 

Intr 42917 42848 

Intr 43184 43072 

Intr 43373 43272 

Init 43758 43458 

>3985952 /2064 

len = 2367 nex = 

35 

Init 49630 49929 

Intr 50622 50752 

Intr 51041 51168 

Intr 51324 51530 

40 Term 51678 51996 

>3985952 /41430 

len = 2159 nex = 

Init 52202 52485

Intr 52919 53097

Intr 53191 53274

Intr 53631 53700

Intr 53796 53929

Term 54020 54360

>3985952 /7925

len = 1873 nex =

Sngl 61124 62996 



>3985952 /12726

len = 291 nex =

Sngl 62743 63033

>3985952 /6129

len = 629 nex =

Sngl 70947 71575

>3985954 /850

len = 2781 nex =

Term 22960 22894

Intr 23249 23031 

Intr 23819 23641 

Intr 24255 24075 

Intr 24624 24497 

Intr 24745 24704

Intr 24912 24833

Intr 25199 25105

Init 25674 25295

>3985954 /39002

len = 2234 nex =

Init 51257 51740

Intr 51829 52098

Term 52600 53490

>3985955 /33506

len = 2022 nex =

Term 12286 12055

Intr 12479 12381

Intr 12746 12565

Intr 12936 12836

Intr 13257 13097

Init 14076 13714

>3985955 /151935

len = 175 nex =

Sngl 55158 55332

>3985955 /22697

len = 1494 nex =

Term 56280 55912

Intr 56493 56416

Intr 56671 56580

Intr 56947 56897

Intr 57238 57184

Init 57405 57326







len = 1548 nex = 4

Init 61 157 + 0

Intr 630 706 + 0

Intr 796 868 + 0

Term 974 1608






>3985955 /33522

len = 1783 nex = 0

>3985957 /108603

len = 472 nex = 1

Sngl 14977 15448 + 0


>3985957 /38105

len = 2110 nex = 6

Init 2224 2751 + 0

Intr 3237 3409 + 0

Intr 3511 3559 + 0

Intr 3661 3827

Intr 3920 4003 + 0

Term 4106 4329 + 0


>3985957 /99783

len = 770 nex = 1

Sngl 32397 33166 + 0




>3985957 /35856

len = 2121 nex = 6

Init 51112 51715 + 0

Intr 52102 52204 + 0

Intr 52327 52415 + 0

Intr 52619 52710 + 0

Intr 52792 52889 + 0

Term 52976 53232 + 0




>3985958 /36809

len = 1903 nex = 4

Init 32457 32776 + 0

Intr 32856 32932 + 0

Intr 33601 33919 + 0

Term 33996 34359 + 0




>3985958 /36602

len = 2296 nex = 5

Init 5089 5425 + 0

Intr 6229 6296 + 0

Intr 6383 6514 + 0

Intr 6654 6769 + 0

Term 6876 7384 + 0



>3985958

len = 1821 nex =

Init 52395 52503

Intr 53039 53117

Intr 53241 53312

Intr 53437 53619

Intr 53708 53783

Term 53955 54215

>3985958 /30108

len = 983 nex =

Term 61360 61218

Intr 61713 61665

Intr 61976 61828

Init 62200 62078

>3985958 /11295

len = 1274 nex =

Term 73348 72789

Intr 73499 73432

Intr 73829 73580

Init 74062 73922

>3985958 /17636

len = 585 nex =

Sngl 7885 7301

>3985958 /16784

len = 1046 nex =

Term 80867 80492

Init 81076 80939

/97883

len = 1074 nex =

Init 8598 8696

Intr 8960 9035

Intr 9202 9513

Term 9628 9671

/mil

len = 1241 nex = 4

Term 18714 18255

Intr 18983 18813

Intr 19249 19172

Init 19495 19337

/465 



Init 23467 23626 
Term 23709 24183 



Term 46405 45777 
Intr 46873 46618 
Init 48003 47263



Term 46405 45643 
Intr 46873 46618
Init 47683 47263 



>4003353

len = 2197

Term 3077 2689

Intr 3298 3253

Intr 3684 3577

Intr 3834 3775

Intr 4211 4130

Intr 4401 4286

Intr 4622 4494

Init 4885 4714



2590 nex =

Init 69646 69794

Intr 69914 70187

Intr 70279 70334

Intr 70442 70590

Intr 70894 71049

Intr 71189 71234

Intr 71571 71637

Term 71818 71894

Sngl 72936 72485



0 









len = 298 nex = 1

Sngl 73011 72714 - 0




>4003353 /21040

len = 511 nex = 1

Sngl 73016 72506 - 0




>4006815 /11214

len = 2790 nex = 10

Init 107144 107253 + 0

Intr 107333 107388 + 0

Intr 107462 107594 + 0

Intr 107672 107746 + 0

Intr 107835 107950 + 0

Intr 108022 108073 + 0

Intr 108174 108225 + 0

Intr 108306 108363 + 0

Intr 108456 108554 + 0

Term 108635 108985








>4006815 /36277

len = 2791 nex = 10

Init 107144 107253 + 0

Intr 107333 107388 + 0

Intr 107462 107594 + 0

Intr 107672 107746 + 0

Intr 107835 107950 + 0

Intr 108022 108073 + 0

Intr 108174 108225 + 0

Intr 108306 108363 + 0

Intr 108456 108554 + 0

Term 108635 108992 + 0




>4006815 /10529

len = 1092 nex = 0

>4006815 /2886

len = 624 nex = 1

Sngl 114766 115389 + 0




>4006815 /36965

len = 2350 nex = 9

Init 13554 13743

Intr 13824 13879

Intr 13954 14112

Intr 14196 14295

Intr 14381 14523

Intr 14714 14819

Intr 15089 15227

Intr 15323 15591

Term 15672 15895

>4006815 /2240

len = 2135 nex =



Init 18070 18214

Intr 18297 18391

Intr 18473 18554

Intr 18619 18681

Intr 18791 18845

Intr 18932 19014

Intr 19321 19416

Intr 19606 19665

Term 19874 20200



>4006815 /18652 

len = 2128 nex = 

Init 18079 18214

Intr 18297 18391

Intr 18473 18554

Intr 18619 18715

Intr 18791 18845

Intr 18932 19014

Intr 19321 19416

Intr 19606 19665

Term 19874 20202

>4006815 /29981

len = 550 nex =

Sngl 47608 47066

>4006815 /12018

len = 594 nex =

Sngl 48483 47890

>4006815 /33863

len = 207 nex =

Sngl 53635 53429

>4006815 /6541

len = 1765 nex =

Term 54408 53438

Init 55202 54521



>4006885 /38141 

len = 1902 nex = 

Init 102102 102792

Intr 103399 103488

Term 103569 103636

>4006885 /21999

len = 1291 nex =

Init 107190 107560

Term 107858 108480

>4006885 /143475

len = 1163 nex =

Term 113830 113699

Init 114861 114488

>4006885 /36845

len = 1815 nex =

Init 127608 127748

Intr 127830 127937

Intr 128025 128163

Intr 128267 128437

Intr 128516 128756

Intr 128833 128924

Term 129014 129422

>4006885 /95065

len = 238 nex =

Sngl 143397 143160

>4006885 /40968

len = 1059 nex =

Term 166624 166434

Intr 166800 166712

Intr 166946 166883

Intr 167137 167035

Intr 167317 167214

Init 167492 167421



>4006885 /36701 

len = 1410 nex = 

Sngl 26044 24635 

>4006885 /30175 



40 



55 



6 0 len = 



16 65 nex = 





Init 
Intr 
Intr 
Intr 
Term 

>4006885 
len = 

>4006885 

len = 

Term 
Intr 
Intr 
Init 

>4006885 

len = 

Term 
Init 



27294 27381 

27469 27545 

27672 27749 

27847 27963 

28043 28323 

/17161 

564 nex = 

739745 



31714 31054 

31984 31812 

32362 32067 

32732 32435 

/35084 

1873 nex = 

43430 42975 

44847 44183 



>4006885 

len = 

Init 
Intr 
Intr 
Term 

>4027862 

len = 

Init 
Intr 
Term 

>4027862 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4027862 

len = 



1002 nex = 

74420 74452 

74579 74714 

74988 75055 

75142 75421 

/13725 

16 30 nex = 

13479 13828 
14535 14558 
14658 15106 

737455 

17 92 nex = 

19304 19667 

19757 19881 

20420 20511 

20639 20725 

20814 21095 

714541 

1376 nex = 



Init 39417 39585 
Intr 39658 39884

Intr 40036 40369 + 0

Term 40452 40792 + 0




>4038029 /12043

len = 1878 nex = 11

Term 23390 23308 - 0

Intr 23553 23476 - 0

Intr 23748 23695 - 0

Intr 23896 23833 - 0

Intr 24208 24163

Intr 24367 24295 - 0

Intr 24518 24447

Intr 24700 24608 - 0

Intr 24842 24790

Intr 25011 24921 - 0

Init 25171 25089 - 0


>4038029 /12568

len = 2394 nex = 11

Term 23390 23002 - 0

Intr 23553 23476 - 0

Intr 23748 23695 - 0

Intr 23896 23833 - 0

Intr 24208 24163 - 0

Intr 24367 24295 - 0

Intr 24518 24447

Intr 24700 24608 - 0

Intr 24842 24790 - 0

Intr 25011 24921 - 0

Init 25395 25089 - 0


>4038029 /31175

len = 719 nex = 2

Term 41247 40834 - 0

Init 41552 41441 - 0

>4038029 /205695

len = 575 nex = 2

Term 46021 45667 - 0

Init 46241 46125 - 0

>4038029 /364

len = 598 nex = 2

Term 46021 45663 - 0

Init 46260 46125 - 0

>4038029 /6995

len = 767 nex = 2

Term 57543 57164

Init 57930 57799

>4038029 /9804 

len = 2193 nex =

Init 65054 65282

Intr 65599 65995

Term 66507 66778

>4049332 /31275

len = 1090 nex =

Sngl 36708 37796

>4049332 /12232

len = 1707 nex =

Init 38921 39520

Intr 39710 39871

Term 40057 40627

>4049332 /31460

len = 1016 nex =

Init 39710 39871

Term 40057 40719

>4049332 /37529

len = 2144 nex =

Term 63854 63126

Intr 64463 63966

Init 65269 65079

>4049332 /34310

len = 1773 nex =

Term 70001 69714

Intr 70410 70363

Intr 71182 71025

Init 71486 71294

>4049332 /28012

len = 1066 nex =

Init 77171 77322

Term 77644 78233

>4049332 /16844

len = 957 nex =



Term 90208 89783

Intr 90525 90453

Init 90723 90633

>4056429 /33126

len = 1510 nex =

Init 10968 11580

Intr 11710 11780

Intr 11934 12053

Intr 12141 12259

Term 12393 12475

>4056429 /3465

len = 1407 nex =

Term 18080 17627

Intr 18319 18176

Intr 18697 18419

Init 19033 18817

>4056429 /26826

len = 1648 nex =

Term 27443 26816

Intr 27669 27526

Intr 27875 27753

Intr 28125 27964

Init 28463 28218

>4056429 /92839

len = 1320 nex =

Term 29933 29790

Intr 30139 30074

Intr 30433 30254

Init 30662 30509

>4056429 /6245

len = 1515 nex =

Term 33847 33332

Intr 34084 33941

Intr 34319 34185

Intr 34571 34422

Init 34846 34705

>4056429 /10721

len = 2590 nex =

Term 47922 47487

Intr 48181 48035

Intr 49877 49731

Init 50069 49958





>4056429 
len = 
SngL 

>4056429 

len = 

Init 
Term 

>4056429 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4056476 
len = 
Sngl 

>4056476 

len = 

Init 
Intr 
Intr 
Term 

>4056476 
len = 
Sngl 

>4056476 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



736437 
1094 nex = 
59727 60545 

/15450 

1313 nex = 

74846 75064 
75185 75669 

/18845 

2571 nex = 

94527 94779 

94873 94949 

95041 95163 

95819 95923 

96020 96092 

96192 96263 

96369 96521 

96649 96772 

96860 97097 

/31973 

57 0 nex = 

105977 106546 

/39206 

1908 nex = 

107066 107469 

107817 107846 

107898 108112 

108463 108973 

/205753 



107310 107212 
/98881 



1823 



nex = 



9508 9155 

9685 9625 

9877 9786 

10054 9975 

10183 10147 

10330 10271 

10462 10417 

10672 10540 





Init 10977 10757 



>4056476 /15831 



1631 



Init 
Intr 
Intr 
Term 



11178 
11809 
12030 
12389 



11501 
11949 
12302 
12808 



len = 1479 nex = 

Term 112566 112276

Intr 112710 112642

Init 113754 113600

>4056476 /29782

len = 1167 nex =

Term 17151 17095

Intr 17449 17246

Intr 17819 17548

Init 18261 18164



>4056476 

30 

len = 

Term 
Intr 

3 5 Init 
>4056476 
len = 

40 

Term 
Intr 
Intr 
Intr 

45 Intr 
Init 

>4056476 

50 len = 

Term 
Intr 
Intr 

55 Intr 
Intr 
Init 



78397 

1454 nex = 

17151 16982 

17449 17246 

17819 17548 

733020 

1690 nex = 

39931 39868 

40166 40071 

40433 40246 

40634 40539 

40814 40775 

41554 41439 

7113469 

2530 nex = 

39931 39376 

40166 40071 

40433 40246 

40634 40539 

40814 40775 

41552 41439 



>4056476 

60 



743008 





len = 2506 nex =

Term 39931 39406

Intr 40166 40071

Intr 40433 40246

Intr 40634 40539

Intr 40814 40775

Init 41552 41439

>4056476 /8849



Term 

15 Intr 
Intr 
Init 

>4056476 

.. 2 0 

len = 

Term 
Intr 

2 5 Intr 
Init 



4259 4148 

4695 4357 

5068 4967 

5605 5183 

/36303 

1786 nex = 

4259 4148 

4695 4357 

5068 4967 

5616 5183 



6 

0 
0 
0 
0 
0 
0 



4 

0 
0 
0 
0 



4 

0 
0 
0 
0 



>4056476 /9001 

len = 1248 nex =

Sngl 66616 65369

>4056476 /16393

len = 2532 nex =

Term 96093 95777

Intr 96329 96178

Intr 96583 96422

Intr 97005 96654

Intr 97159 97097

Init 98308 98024

>4063730 /13660

len = 1755 nex =

Init 60286 60642

Intr 61422 61490

Term 61594 62040

>4063730 /21240

len = 2170 nex =

Init 72484 73175

Intr 73451 73571

Intr 73660 73844

Intr 73922 74053



Term 74132 74268




>4063730 /38879

len = 2303 nex =

Term 91140 90858

Intr 91393 91280

Intr 91609 91538

Intr 92030 91961

Intr 92214 92117

Intr 92495 92294

Init 93160 92726

>4063735 /1351

len = 2263 nex =

Init 105458 105582

Intr 106342 106839

Term 106936 107720



>4063735 

25 len = 

Init 
Term 

30 >4063735 
len = 
Sngl 

35 

>4063735 
len = 
4 0 Sngl 
>4063735 
len = 



45 



Term 
Init 



>4063735 
len = 



736464 

1399 nex = 

106315 106839 
106936 107713 

/110447 

659 nex = 

107104 107748 

799323 

1093 nex = 

15861 16953 

/148671 

1079 nex = 

43823 43650 
44402 44347 

729766 

1949 nex = 



Term 97324 96849

Intr 97547 97491

Intr 97772 97630

Intr 98263 98128

Intr 98642 98446

Init 98797 98734

>4063737



nex = 8

Term 8947 8682 - 0

Intr 9161 9108 - 0

Intr 9393 9318 - 0

Intr 9521 9490 - 0

Intr 9711 9616 - 0

Intr 9843 9804 - 0

Intr 9992 9931 - 0

Init 10722 10413 - 0




>4063737 /37252

len = 2858 nex = 12

Term 33875 33630 - 0

Intr 34209 33988 - 0

Intr 34433 34313 - 0

Intr 34722 34531 - 0

Intr 34908 34836 - 0

Intr 35127 35004 - 0

Intr 35284 35208 - 0

Intr 35494 35372 - 0

Intr 35672 35567 - 0

Intr 35825 35763 - 0

Intr 36084 35913 - 0

Init 36487 36329 - 0


>4063737 /36243

len = 501 nex =

Sngl 56605 57105 + 0

>4063737 /13761

len = 2567 nex = 2

Init 61983 62535 + 0

Term 63541 64549 + 0




>4063737 /17020

len = 2217 nex = 2

Init 74589 74633 + 0

Term 74734 75239 + 0

>4063756 /154687

len = 1056 nex = 3

Init 37451 37699 + 0

Intr 38169 38269 + 0

Term 38354 38506 + 0

>4063755 /15276

len = 2125 nex = 4







Init 
Intr 
Intr 

5 Term 
>4079614 
len = 

10 

Sngl 
>4079614 
15 len = 

Sngl 
>4079614 

20 

len = 
Sngl 

25 >4079614 
len = 
Sngl 

30 

>4079614 

len = 

35 Init 
Intr 
Intr 
Intr 
Term 

40 

>4079614 
len = 
45 Sngl 
>4092471 
len = 

50 

Sngl 

>4092471 

55 len = 

Term 
Intr 
Init 



37451 37699 

38169 38269 

38354 38547 

38976 39575 

/9756 

974 nex = 

27761 28734 

/36980 

1107 nex = 

38778 37672 

/18761 

250 nex = 

64013 63770 

739763 

10 3 0 nex = 

73340 74368 

/6208 

254 8 nex = 

75720 75893 

75986 76102 

76525 75695 

76777 76989 

77081 77542 

/4512 

490 nex = 

96496 96984 

722446 

506 nex = 

11405 11769 

7103691 

1759 nex = 

42647 42351 
43888 43766 
43998 43927 





>4092471 

len = 

5 Term 
Intr 
Init 

>4092472 

10 

len = 
Sngl 

15 >4092472 
len = 
Term 

2 0 Intr 
Intr 
Intr 
Init 

25 >4092472 
len = 
Term 

30 Intr 
Intr 
Intr 
Intr 
Intr 

35 Init 
>4096078 
len = 

40 

Sngl 

>4096078 

4 5 len = 

Term 
Intr 
Intr 

50 Intr 
Intr 
Intr 
Intr 
Intr 

55 Intr 
Init 

>4096078 

60 len = 



/20391 

1615 nex = 

42647 42495 
43888 43765 
43998 43927 

/16473 

1615 nex = 

20093 20396 

/3126 

28 65 nex = 

23958 23048 

24223 24110 

24606 24316 

24981 24673 

25912 25060 

/104778 

3719 nex = 

46430 46119 

46578 46521 

46706 46670 

47519 47278 

47978 47862 

48234 48193 

49837 49522 

/155459 

415 nex = 

42191 42304 

/37034 

19 48 nex = 

45801 45724 

45988 45893 

46149 46063 

46287 46219 

46556 46369 

46697 46640 

46828 46769 

46936 46901 

47134 47023 

47376 47303 

/21724 

2303 nex = 





0 
0 
0 



+ 0 



5 

0 
0 
0 
0 
0 



7 

0 
0 
0 
0 
0 
0 
0 



1 

+ 0 



10 

0 
0 
0 
0 
0 
0 
0 
0 
0 
0 



Term 45801 45724 - 0

Intr 45988 45893 - 0

Intr 46149 46063 - 0

Intr 46287 46219 - 0

Intr 46556 46369 - 0

Intr 46697 46640 - 0

Intr 46828 46769 - 0

Intr 46936 46901 - 0

Intr 47134 47023 - 0

Init 47599 47303 - 0

>4096078 /15956

len = 1336 nex =



Init 54120 54197 + 0

Intr 54285 54356 + 0

Intr 54439 54543 + 0

Intr 54618 54710 + 0

Intr 54798 54890 + 0

Intr 54983 55106 + 0

Term 55206 55327 + 0

>4096078 /8301



len = 1697 nex =

Term 55725 55556 - 0

Intr 55873 55814 - 0

Intr 56063 55943 - 0

Intr 56218 56142 - 0

Intr 56400 56306 - 0

Intr 56619 56513 - 0

Init 57252 56708 - 0



>4096078 /20936 

len = 1419 nex = 

Term 58407 58007

Intr 58821 58569

Init 59425 59245

>4096078 /11587

len = 698 nex =

Sngl 62604 61911

>4115352 /125955

len = 1433 nex =

Term 26979 26916

Intr 27312 27194

Init 27672 27485

>4115352 /19338

len = 1003 nex =

Term 26979 26681

Intr 27312 27194

Init 27683 27485

>4115352 /41682

len = 1450 nex =

Term 27312 27194

Init 27690 27485

>4115370 /104717





len = 1480 nex =

Sngl 21740 21519

>4115370 /117533

len = 1330 nex =

Term 43085 42601

Intr 43330 43160

Intr 43471 43406

Intr 43806 43554

Init 43929 43880

>4115370 /40540

len = 2671 nex =

Term 86403 86022

Intr 86600 86484

Intr 87059 86681

Intr 87301 87146

Intr 87515 87379

Intr 87708 87589

Intr 87929 87801

Intr 88216 88023

Init 88692 88422



>4115912 /2343 

len = 2183 nex =

Term 15312 14953

Intr 15468 15407

Intr 15691 15575

Intr 15840 15781

Intr 16048 15936

Intr 16206 16134

Intr 16322 16278

Intr 16522 16412

Init 17135 16691

>4115912 /11730

len = 896 nex =

Init 34905 35153

Term 35318 35800

>4115912 /8445

len = 1254 nex =

Init 95230 95390

Intr 95475 95562

Intr 95755 95879

Term 95956 96483

>4115930 /40478

len = 1543 nex =

Term 41954 41542

Init 43084 42069

>4115930 /13513 





len = 1185 nex =

Init 46052 46100

Intr 46383 46693

Intr 46778 46829

Term 47059 47236

>4115930 /39698

len = 1656 nex =

Init 53288 53526

Intr 54255 54411

Term 54517 54943

>4115930 /9218

len = 1162 nex =

Init 59597 59737

Intr 60190 60346

Term 60446 60758

>4115930 /3033

len = 2350 nex =

Term 75735 75445

Intr 75948 75854

Intr 76111 76037

Intr 76336 76251

Intr 76789 76738

Intr 77027 76953

Intr 77177 77111

Init 77787 77261

>4159699 /1558



Term 
Init 

>4159699 

len = 

Init 
Intr 
Intr 
Term 

>4159700 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4159700 

len = 

Term 
Init 

>4159700 

len = 

Sngl 

>4159700 

len = 

Init 
Term 

>4159700 

len = 

i Init 
Term 



1272 nex = 

14823 14303 

15574 15206 

/17101 

2330 nex = 

8693 8791 

9044 9107 

9186 9395 

9532 10243 



/11277 



2170 

2785 
3119 
3301 
3481 
3634 
3811 
4060 
4305 
4488 
4639 
4715 



3024 
3211 
3401 
3547 
3733 
3924 
4197 
4389 
4537 
4703 
4949 



1278 nex = 

33490 33138 
33894 33573 

/106065 

550 nex = 

36 578 

/11225 

19 5 9 nex = 

5625 6382 
6789 7074 

/37830 

2006 nex = 

5116 6382 
6789 7121 



60 len = 



19 9 0 nex = 



2 





Init 
Term 

5 >4159702 

len = 

Term 

10 Intr 
Intr 
Init 

>4159703 

15 

len = 

Init 
Intr 

2 0 Intr 

Intr 
Term 

>4159704 

25 

len = 

Term 
Intr 

3 0 Intr 

Init 

>4159704 

3 5 len = 

Term 
Intr 
Intr 

40 Init 
>4159704 
len = 

45 

Init 
Term 

>4159704 

50 

len = 

Init 
Intr 

55 Intr 
Intr 
Intr 
Intr 
Term 

60 



5625 6382 

6789 7121 

739782 

1769 nex = 

19118 18896 

19546 19462 

19766 19651 

20664 20383 

722343 

2215 nex = 

7241 7578 

8231 8293 

8510 8587 

8667 8829 

8902 9455 

713704 

1335 nex = 

11416 11205 

11620 11507 

11891 11712 

12357 12143 

721828 

1419 nex = 

11416 11151 

11620 11507 

11891 11712 

12357 12143 

7109598 

1100 nex = 

5940 6324 

6476 7039 

739832 

2092 nex = 

71302 71572 

71670 71791 

71963 72067 

72212 72304 

72593 72655 

72773 72882 

73080 73393 





>4159704 

len = 

5 Ini-t 
Intr 
Term 

>4159705 

10 

len = 

Init 
Intr 

15 Term 
>4159705 
len = 

20 

Sngl 
>4159705 
25 len = 

Sngl 
>4159705 

30 

len = 
Sngl 

35 >4159705 
len = 
Init 

40 Intr 
Intr 
Intr 
Intr 
Intr 

45 Term 
>4159706 
len = 

50 

Sngl 
>4159706 
5 5 len = 

Sngl 



/13305 

2400 nex = 

80000 81012 
81347 81522 
81623 82399 

/9508 

163 9 nex = 

28709 28753 
29227 29324 
29566 29821 

/114725 

490 nex = 

73649 73181 

/109298 

730 nex = 

73980 73255 

/326 

1716 nex = 

74641 73196 

737994 

2303 nex = 

77530 77634 

77709 77959 

78103 78229 

78328 78601 

78834 78966 

79353 79439 

79694 79832 

/112601 

6 90 nex = 

10748 11437 

/16938 

133 0 nex = 

14792 13472 



>4159706 /30320 

60 





Sngl 
5 >4159706 
len = 
Term 

10 Init 
>4159705 
len = 

15 

Term 
Intr 
Intr 
Init 

20 

>4159706 

len = 

25 Init 
Intr 
Intr 
Intr 
Intr 

3 0 Intr 
Term 

>4159706 

35 len = 

Sngl 

>4159706 

40 

len = 
Sngl 

45 >4I59706 
len = 
Init 

5 0 Term 
>4159706 
len = 

55 

Init 
Intr 
Intr 
Intr 



370 nex = 
28894 28532 
/112749 
712 nex = 

35875 35547 
36258 35970 

/8290 

1320 nex = 

35876 35548 
36256 35970 
36422 36349 
36867 36752 

735999 

1933 nex = 

53834 53966 

54115 54280 

54489 54541 

54877 54939 

55266 55316 

55381 55498 

55599 55766 

/4581 

1390 nex = 

58788 60171 

738468 

1437 nex = 

65398 66834 

735981 

1281 nex = 

71773 71993 
72232 73053 

727978 

1451 nex = 

74975 75143 

75510 75632 

75728 76041 

76128 76180 

76272 76425 





>4159707 
len = 

5 

>4159707 
len = 
10 Sngl 
>4159707 
len = 

15 

Sngl 

>4159707 

2 0 len = 

Init 
Term 

25 >4159707 
len = 
Sngl 

30 

>4159707 
len = 
35 Sngl 
>4159707 
len = 

40 

Sngl 
>4159707 
4 5 len = 

Sngl 
>4159707 
len = 



50 



Init 
Intr 

5 5 Term 
>4159707 
len = 



/207156 

92 6 nex = 

/41828 

875 nex = 

15648 15958 

729375 

7 92 nex = 

15648 15949 

/37081 

827 nex = 

15176 15533 
15648 16002 

/1B951 

59 5 nex = 

1560 2154 

/32148 

1343 nex = 

16092 17434 

/25313 

1016 nex = 

16466 17481 

/41182 

593 nex = 

16874 17456 

/287B2 

1096 nex = 

27447 27699 
27790 28154 
28252 28542 

797742 

12 20 nex = 



Init 2818 3153 + 0

Term 3582 4037 + 0

>4159707 /42151

len = 1186 nex = 2

Init 2848 3153 + 0

Term 3582 4033 + 0

>4159707 /2356

len = 2110 nex = 7

Term 28857 28597 - 0

Intr 29050 28952 - 0

Intr 29395 29294 - 0

Intr 29617 29493 - 0

Intr 29826 29753 - 0

Intr 30352 30211 - 0

Init 30705 30436 - 0

>4159707 /21101

len = 310 nex =

Sngl 41123 40815 - 0

>4159707 /106867

len = 1340 nex = 2

Term 41164 40724 - 0

Init 41458 41253 - 0

>4159707 /22328

len = 1515 nex = 3

Term 41164 40703

Intr 41458 41253 - 0

Init 42217 42021 - 0

>4159707

len = 190 nex =

Sngl 42223 42042 - 0

>4159707 /38214

len = 1822 nex =

Init 58012 58329 + 0

Intr 58492 58799 +

Term 58885 59833 +

>4159707 /18979

len = 1890 nex = 3



Init 
Intr 
Term 

>4159707 
len = 
Sngl 

>4159707 
len = 



Term 
Init 

>4159707 

len = 

Init 
Intr 
Term 

>4159708 

len = 

Init 
Term 

>4159708 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4159708 
len = 
Sngl 

>4159709 

len = 

Init 
Intr 
Term 



58012 58329 
58492 58799 
58885 59901 

/26705 

53 8 nex = 

63569 54106 

/15303 

14 93 nex = 

68076 67448 
68653 68151 

/94231 

1630 nex = 

78497 78952 
79546 79639 
79732 80120 

735429 

7 64 nex = 

44910 45015 
45368 45673 

/16564 

1359 nex = 



61816 
62321 
62462 
62641 
62943 



62242 
62370 
62549 
62757 
63174 



739933 

1197 nex = 

76529 77725 

78209 

1632 nex = 

28608 28689 
28882 28962 
29255 29368 



len = 2026 nex =

Init 28608 28689 + 0

Intr 28882 28962 + 0

Intr 29255 29368 + 0

Intr 29463 29564 + 0

Intr 29876 29934 + 0

Intr 30031 30133 + 0

Intr 30228 30293 + 0

Term 30387 30479 + 0



Term 
Intr 
Init 



Term 
Intr 

2 5 Init 
>4159709 
len = 

30 

Init 
Intr 
Intr 
Intr 

35 intr 
Intr 
Intr 
Term 

40 >4159709 

len = 

Sngl 

>4159709 

len = 

50 Term 
Init 

>4159709 

55 len = 



45 



Init 
Term 



19 90 nex = 

1570 1249 
2549 2064 
3231 2810 

/13261 

2987 nex = 

37003 36228 
38163 37674 
39214 38597 

76579 

2352 nex = 

70836 71356 

71493 71667 

71764 71913 

72061 72222 

72304 72489 

72587 72709 

72784 72873 

72972 73187 

79155 

557 nex = 

75021 75577 

736834 

657 nex = 

76162 75717 
76373 76339 

7115021 

515 nex = 

82749 82901 
83001 83063 



60 >4159710 713184 





1870 nex ■■ 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4159710 

Sngl 

>4159711 

len = 

Term 
Intr 
Init 



14387 
14579 
14816 
14992 
15289 
15462 
15947 



14080 
14479 
14659 
14930 
15150 
15369 
15591 



/100272 

67 0 nex = 

18553 19215 

/18503 

1595 nex = 

15462 15206 
15812 15748 
16800 16583 



>4159712 
len = 
Sngl 

>4159712 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4159712 

len = 

Term 
Intr 
Init 

>4159712 

len = 

Init 
Term 



/4044 
7 30 nex = 
10754 10446 
/36591 
2809 nex = 



22587 
22740 
23033 
23401 
23755 
23937 
25023 



22215 
22669 
22830 
23272 
23529 
23881 
24745 



/14490 

10 90 nex = 

74186 73691 
74429 74331 
74774 74516 

/1226 

794 nex = 



8922 
9221 



9122 
9715 



>4165340 

60 



/12055 





len = 
Sngl 

5 >4165340 
len = 
Init 

10 Intr 
Term 

>4165340 

15 len = 

Init 
Intr 
Term 

20 

>4165340 

len = 

25 Init 
Intr 
Intr 
Term 

30 >4165340 
len = 
>4165340 

35 

len = 
>4165340 
40 len = 

Sngl 
>4165340 

45 

len = 

Init 
Intr 

5 0 Intr 
Intr 
Intr 
Intr 
Intr 

55 Intr 
Intr 
Term 



52 0 nex = 

105172 104653 

/94038 

217 0 nex = 

3022 3136 
3876 3994 
4100 4352 

/19364 

2182 nex = 

3022 3136 
3876 3994 
4100 4372 

/38840 

2072 nex = 

2281 2871 

3022 3136 

3876 3994 

4100 4352 

/19740 

2030 nex = 

/11470 

1919 nex = 

/31383 

655 nex = 

6708 6054 

/16005 

2365 nex = 

84598 84682 

84770 84891 

84993 85214 

85301 85485 

85563 85698 

85805 85901 

85988 86088 

86193 86356 

86443 86514 

86712 86828 



>4165340 

60 



/17788 



len = 397 nex = 1

Sngl 97179 97575 + 0

>4185120 /6390

len = 3432 nex = 13

Term 15226 14939 - 0

Intr 15428 15316 - 0

Intr 15588 15539 - 0

Intr 15779 15709 - 0

Intr 15948 15879 - 0

Intr 16280 16183 - 0

Intr 16473 16407 - 0

Intr 16656 16579 - 0

Intr 16926 16759 - 0

Intr 17196 17029 - 0

Intr 17587 17291 - 0

Intr 17910 17665 - 0

Init 18370 18221 - 0

>4185120 /96856

len = 646 nex = 2

Term 42352 41925 - 0

Init 42557 42437 - 0

>4185120 /25812

len = 761 nex = 1

Sngl 99742 98982 - 0

>4185128 /92525

len = 1425 nex = 2

Init 13244 13883 + 0

Term 14344 14668 + 0

>4185128 /16416

len = 1248 nex = 1

Sngl 17454 18701 + 0

>4185128 /17464

len = 1840 nex = 5

Term 32604 32490 - 0

Intr 32905 32692 - 0

Intr 33672 32987 - 0

Intr 33912 33755 - 0

Init 34329 34123 - 0

>4185128 /26663





len = 


1076 


nex = 


4 






Init 


38740 


39208 


+ 


0 






39413 


39478 






5 


Intr 


39529 


39651 


+ 


0 




Term 


39743 


39815 


+ 


0 




>4185128 


/24619 






1 0 




1711 


nex = 










43094 


43248 








Intr 


43475 


43537 


+ 


0 




Intr 


44259 


44337 






15 


Intr 


44425 


44470 


+ 


0 




Term 


44574 


44804 


+ 


0 




>4185128 /14084
len = 1815 nex = 6
Init 54517 54995 + 0
Intr 55165 55247 + 0
Intr 55403 55492 + 0
Intr 55577 55725 +
Intr 55817 55948 + 0
Term 56044 56331 +








>4185128 /4231
len = 1128 nex = 1
Sngl 57987 56860 - 0


>4185128 /37734
len = 1154 nex = 1
Sngl 7894 6748 - 0














>4191760 /35366
len = 2418 nex = 7
Term 51175 50955 -
Intr 51401 51262 - 0
Intr 51669 51495 -
Intr 51922 51764 - 0
Intr 52401 52256 -
Intr 52598 52509 - 0
Init 53372 52795 - 0




>4191760 /16162
len = 3435 nex = 7
Term 54020 53640 - 0
Intr 54593 54540 - 0
Intr 54849 54707 - 0
Intr 55352 54943 - 0






Intr 55948 55819 

Intr 56284 56184 

Init 57074 56736 

>4191760 1111^2
len = 2185 nex =
Term 68405 68073
Intr 68552 68485
Intr 68822 68687
Intr 69025 68922
Intr 69410 69139
Init 70257 69851



Init 70697 70805
Term 71007 71062



>4191771
len = 2918 nex =
Term 28814 28356
Intr 28989 28901
Intr 29148 29085
Intr 29326 29244
Intr 29548 29422
Intr 29742 29658
Intr 29885 29822
Intr 30040 29979
Intr 30224 30154
Intr 30405 30310
Intr 30627 30479
Intr 30777 30708
Intr 30887 30847
Init 31273 30959



len = 2857 nex =
Init 33231 33385
Intr 33490 33533
Intr 33762 33857
Intr 33987 34079
Intr 34302 34389
Intr 34494 34555
Intr 34638 34822
Intr 34906 35083
Intr 35171 35214
Intr 35298 35514
Intr 35611 35628
Term 35711 36087



>4191771 

60 



/35176 






2743 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4191771 

len = 

Sngl 

>4191771 

len = 

Sngl 

>4191771 

len = 

Sngl 

>4191771 

len = 

Term 
Intr 
Init 

>4191771
len = 
Sngl 

>4199934 

len = 

Init 
Term 

>4199934 

len = 

Sngl 



33285 
33490 
33762 
33987 
34302 
34494 
34638 
34906 
35171 
35298 
35711 



33385 
33533 
33857 
34079 
34389 
34555 
34822 
35083 
35214 
35514 
36017 



/33083 

217 nex = 

54044 54260 

/125285 

447 nex = 

65686 65240 

/152285 

1045 nex =

66311 65267 

/19555 

1320 nex = 

71563 70726 
71884 71654 
72045 71968 

726452 

850 nex = 

83459 83331 

729227 

1035 nex = 

19693 19801 
20093 20727 

733136 

33 8 nex = 

20399 20729 



60 >4199934 



7154322 






len = 

Sngl 

>4199934 

len = 

Sngl 

>4204173 

len = 

Term 
Init 

>4204173 

len = 

Term 
Intr 
Init 

>4217996 
len = 
Sngl 

>4217996 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4218109 

len = 



352 nex = 

23117 23462 

736784 

1133 nex = 

23117 24249 

/41722 

1054 nex = 

21811 21377 
22430 21898 

/14471 

1120 nex = 

23568 23254 
23859 23729 
24373 23953 

/8441 
1482 nex = 
51080 52561 

738968 

2388 nex =

52825 53030 

53128 53176 

53267 53355 

53445 53501 

53636 53698 

53788 53853 

53992 54078 

54220 54313 

54408 55212 

7324 

1860 nex = 



Term 


39406 


38929 




Intr 


39731 


39476 




Intr 


40077 


39838 




Intr 


40403 


40276 




Init 


40788 


40484 





len 

6 0 



1690 nex = 



5 






Init 
Intr 
Intr 
Intr 
Term 

>4218109 

len = 

Term 
Intr 
Intr 
Init 

>4218109

len = 



Term 
Init 

>4220468 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4220468 

Sngl 
>4220468 
len = 
Sngl 
>4220468 
len = 
Sngl 
>4220468 
len = 
>4220468 
len = 



42662 43143 

43303 43336 

43644 43748 

43827 44015 

44115 44344 

/36080 

1657 nex = 

53929 53688 

54126 54076 

54789 54211 

55344 55008 

738545 

654 nex = 

89772 89411 
90064 89883 

/41875 

1728 nex = 



25333 
25490 
25978 
26212 
26477 



24750 
25414 
25767 
26057 
26297 



/917 
599 nex = 
26654 27252 
/25454 
310 nex = 
42788 42485 
/19157 
1214 nex = 
43720 42507 
739244 
2268 nex = 

78247 
1486 nex = 



Init 61354 61713 
Intr 61815 61920






Intr 
Intr 
Term 



62016 62099 
62194 62322 
62414 62839 



>4220468 
len = 



1120 



nex = 



Term 
Intr 
Intr 
Intr 
Init 



75604 
75828 
76086 
76234 
76442 



75323 
75680 
76037 
76176 
76342 



>4220468 

len = 

Term 
Intr 
Init 



/104853 

1161 nex = 

80528 80150 
80846 80610 
81310 80910 



>4220510 
len = 



733382 
1810 nex = 



Term 
Init 



119483 118560 
120362 119827 



>4220510 

len = 

Term 
Init 

>4220510 

len = 

Init 
Intr 
Term 

>4220627 

len = 



/39461 
1133 nex 



62775 
63410 



62278 
63166 



/36681 

2001 nex = 

8869 9090 
9370 10119 
10211 10869 

/11357 

67 0 nex = 



Term 
Init 



23338 22980 
23642 23419 



>4220627 
len = 



Init 
Term 



/1267 
1798 nex =



30233 30321 
30435 32030 



>4220630 

60 



/10008 






nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



16866 
17015 
17288 
18114 
18317 
18553 
18722 
18974 
19138 



16556 
16956 
17193 
18032 
18255 
18457 
18641 
18880 
19050 



>4220630 
len = 



Term 
Init 



3819 3692 
4805 4216 



736799 
2 8 43 nex 



Init 

2.5 Intr 
Intr 
Intr 
Intr 
Intr 

3 0 Intr 

Intr 
Intr 
Term 

35 >4220631 
len = 
Sngl 

40 

>4220631 
len = 

4 5 Init 

Intr 
Intr 
Intr 
Intr 

5 0 Intr 

Intr 
Term 



27324 
27591 
27785 
27952 
28143 
28314 
28670 
28922 
29085 
29308 



27510 
27698 
27877 
28044 
28228 
28592 
28803 
29014 
29202 
29650 



550 nex = 
29083 29202 



3328 

32801 
33757 
34429 
34840 
35182 
35373 
35509 
35716 



nex = 

32924 
33803 
34757 
34922 
35257 
35437 
35604 
36125 



>4220631 
len = 



Init 43315 43344 
Term 43425 43740 

60 






>4220631 

len = 

Sngl 

>4220631 

len = 

Sngl 

>4220632 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4220632 

len = 

Sngl 

>4220632 

len = 

Sngl 

>4220632 

len = 

Term 
Init 

>4220633 

len = 

Init 
Term 

>4220633 

len = 

Sngl 

>4220633 

len = 

Sngl 



/4634 
271 nex = 
43471 43741 
/21618 
1544 nex = 
52445 52155 
/21608 
2002 nex = 



15759 
16186 
16326 
16993 
17201 
17512 



16019 
16243 
16397 
17112 
17302 
17760 



/33442 
323 nex = 
38225 38547 
/11562 
1630 nex = 
40558 40363 
/31276 
539 nex = 



42146 
42336 



41798 
42218 



/6602 

1537 nex = 

24986 25221 
25635 26107 

/14561 

1210 nex = 

41689 40482 

/43004 

825 nex = 

43944 44768
2 510 nex



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



47139 
47335 
47521 
47676 
47876 
48100 
48246 
48458 
48705 
48914 
49196 



47255 
47437 
47586 
47799 
48010 
48160 
48322 
48613 
48833 
49046 
49300 



>4220635 
len = 



2031 



nex ■■ 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



38166 
38520 
38891 
39176 
39361 
39711 
39886 



38351 
38541 
39017 
39266 
39450 
39811 
40196 



>4220635 
len = 



2028 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



38172 
38520 
38891 
39176 
39361 
39711 
39886 



38351 
38541 
39017 
39266 
39450 
39811 
40199 



>4220635 



len = 
Sngl 
>4220635 
len = 



40942 41567 
725758 
1948 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



44440 
44775 
45285 
45644 
45969 
46128 



44586 
44978 
45530 
45886 
46034 
46387 



>4220635 

60 



/40185
len =
Sngl 
>4220636 
len = 



2132 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



22952 
23266 
23412 
23692 
23869 
24090 
24233 
24382 
24578 
24783 



23071 
23317 
23607 
23771 
23985 
24140 
24299 
24485 
24682 
25083 



>4220636 
len = 



/121993 
1306 nex 



Term 
Intr 
Intr 
Intr 
Init 



37609 
38097 
38318 
38425 
38699 



37394 
37965 
38196 
38399 
38603 



>4220637 

len = 

Term 
Init 

>4220637
len = 
Sngl 

>4220637 
len = 



729678 

1417 nex = 

13304 12526 
13942 13414 

/19279 

614 nex = 

17264 17877 

/26411 

1523 nex = 



Init 
Intr 
Intr 
Term 



36910 37153 

37747 37816 

37913 38021 

38124 38432 

/5961 



Init 
Intr 
Intr 
Term 



1390 nex 



38657 
39193 
39403 
39552 



38747 
39308 
39466 
40045
>4220637
len = 



732574 
2029 nex 



Init 
Intr 
Intr 
Term 



52862 53196 

53299 53516 

53613 53860 

53997 54890 



>4220638 
len = 



736765 
1180 nex 



Term 
Intr 
Init 



10773 10592 
11443 10925 
11771 11524 



>4220638 
len = 



716377 
1782 nex 



Term 
Intr 
Intr 
Init 



10773 10360 

11443 10925 

11826 11524 

12141 11914 



>4220638 
len = 



Init 
Term 



77015 
586 ne: 



30811 30933 
31055 31396 



>4220638 
len = 



719110 
1340 nex =



Term 
Init 



33848 32699 
34038 33944 



>4220638 
len = 



7102368 
73 0 nex 



Init 
Term 



>4220640 
len = 



Term 
Init 



40029 40250 
40282 40754 



732066 
8 75 nex 



778 350 
1099 1015 



>4220640 
len = 



7101893 
646 nex 



Sngl 1489 2131 

60 



0 






>4220640 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4220640 

len = 
Sngl 
>4220640 

len = 
>4220640 

len = 
Sngl 
>4220640 

len = 

Term 
Intr 
Intr 
Init 

>4220640 

len = 

Term 
Intr 
Init 

>4220640 

len = 

Term 
Intr 
Intr 
Init 

>4220640 

len = 



/6915 
2050 



21763 
22330 
22578 
22894 
23311 
23473 



nex = 

22063 
22386 
22783 
23167 
23382 
23804 



728838 

594 nex = 

27690 27097 

733726 

1839 nex = 

/37012 

441 nex = 

31588 32028 

/20116 

1516 nex = 

32430 31988 

32676 32525 

33332 32969 

33503 33404 

/6156 

1498 nex = 

32430 32015 
32676 32525 
33512 32969 

/10926 

1533 nex = 

32430 32015 

32676 32525 

33332 32969 

33547 33404 

72767 

1053 nex = 



Term 34226 34114 
Intr 34482 34391 
Init 34843 34602
>4220640

len = 

Init 
Intr 
Intr 
Intr 
Term 



Term 
Intr 
Intr 
Intr 
Init 

>4220640 

len = 

Sngl 

>4220640 

Init 
Intr 
Intr 
Term 

>4220640 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4220640 
len = 
Sngl 

>4220641 
len = 



1393 nex = 

36025 36150 

36243 36283 

36497 36565 

36651 36707 

36797 37018 

/6800 

1697 nex = 



45977 
46279 
46510 
46828 
47542 



45853 
46055 
46382 
46790 
47110 



/141813 
951 nex = 
5195 4245 

/40646 
37 64 nex = 



65692 
66421 
66651 
66834 



65939 
66558 
66740 
66890 



1814 nex = 



71705 
71931 
72059 
72390 
72546 
72752 
72899 
73185 



71840 
71971 
72313 
72450 
72670 
72814 
73023 
73518 



/10103 
610 nex = 
82414 83015 
/41308 
1514 nex = 



Term 16718 16586 - 0
Intr 16848 16803 -
Intr 17051 16951 -
Intr 17286 17149 -
Intr 17585 17487 -
Init 17849 17733 -


>4220641 /11265
len = 1899 nex = 6
Term 28923 28663 -
Intr 29185 29124 -
Intr 29344 29270 -
Intr 29971 29425 -
Intr 30137 30073 -
Init 30561 30314 -


>4220641 /1152
len = 1189 nex = 3
Term 68284 68140 -
Intr 68713 68513 -
Init 69328 69078 -




>4220643 /125902
len = 550 nex = 1
Sngl 11332 11876 +


>4220643 /8788
len = 1231 nex = 3
Term 12170 11960 -
Intr 12444 12249 -
Init 13190 12769 -


>4220643 /10876
len = 1876 nex = 6
Init 13443 13678 +
Intr 13778 13880 +
Intr 14150 14217 +
Intr 14462 14653 +
Intr 14738 14856 +
Term 14959 15318 +


>4220643 /109639
len = 2710 nex = 5
Term 18433 18225 -
Intr 20036 19936 -
Intr 20200 20123 -
Intr 20322 20279 -
Init 20572 20403 -

len = 1549 nex = 6
Term 27866 27561 - 0
Intr 28045 27980 - 0
Intr 28239 28143 - 0
Intr 28564 28318 -
Intr 28840 28748 - 0
Init 29109 28937 -








>4220643 /2347
len = 833 nex = 1
Sngl 35179 34347 - 0




>4220643 /10875
len = 2193 nex = 6
Term 2953 2902 - 0
Intr 3203 3096 - 0
Intr 3434 3291 - 0
Intr 3609 3517 - 0
Intr 3792 3721 - 0
Init 4031 3873 - 0




>4220643 /126592
len = 2377 nex = 6
Term 7320 7059 - 0
Intr 7755 7637 - 0
Intr 8012 7841 - 0
Intr 8587 8399 - 0
Intr 9007 8930 - 0
Init 9247 9210 - 0


>4220644 /40534
len = 2331 nex = 6
Term 14665 14123 - 0
Intr 15328 14749 - 0
Intr 15498 15413 - 0
Intr 15667 15597 - 0
Intr 15888 15738 - 0
Init 16453 16250 - 0














>4220644 /39041
len = 1930 nex = 5
Init 17880 18258 + 0
Intr 18361 18441 + 0
Intr 18898 19002 + 0
Intr 19139 19288 + 0
Term 19396 19805 + 0

len = 2758 nex = 8
Init 3859 4029 + 0
Intr 4630 4933 + 0
Intr 5050 5142 + 0
Intr 5216 5349 +
Intr 5438 5564 + 0
Intr 5941 6046 + 0
Intr 6174 6245 + 0
Term 6344 6616 + 0




>4220644 /40770
len = 2268 nex = 10
Term 60836 60743 - 0
Intr 60995 60915 -
Intr 61151 61082 - 0
Intr 61384 61296 - 0
Intr 61819 61711 - 0
Intr 62039 61921 -
Intr 62183 62124 - 0
Intr 62462 62339 - 0
Intr 62674 62574 - 0
Init 63010 62903 - 0




>4220644 /37152
len = 2246 nex = 5
Init 68479 69417 + 0
Intr 69503 69580 + 0
Intr 69662 69773 + 0
Intr 70259 70309 + 0
Term 70434 70724 + 0




>4220645 /19689
len = 1479 nex = 3
Init 18327 18524 +
Intr 18628 18878 + 0
Term 19278 19805 + 0




>4220645 /102797
len = 1492














>4220645 /40166
len = 430 nex = 1
Sngl 19392 19821 + 0




>4220645 /8735
len = 1330 nex = 6








Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4220645 

len = 

Init 
Intr 
Term 

>4220645 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4220645 
len = 
Sngl 

>4220645 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4220645 

len = 

Init 
Term 

>4235150 
len = 
Sngl 

>4235150 
len = 



20047 19864 

20212 20150 

20365 20303 

20513 20457 

20721 20611 

21190 20957 

729889 

1879 nex = 

24415 24465 

24561 25553 

25635 26068 

724775 

1521 nex = 



2820 
3077 
3417 
3923 
4112 



2592 
2961 
3180 
3810 
4012 



78347 
531 nex = 
53094 52564 
730517 
1717 nex = 



69635 
70115 
70339 
70658 
71018 



70034 
70245 
70572 
70913 
71352 



713425 
573 nex 



70771 
71018 



70913 
71340 



715739 
894 nex = 
33125 32244 
721075 
811 nex = 



Init 52126 52284 
Term 52598 52936






>4235150 

len = 

Init 
Term 

>4235150 

len = 

Sngl 

>4235150 

len = 

Sngl 

>4235150 

len = 

Term 
Intr 
Init 

>4235150 

len = 

Intr 
Init 

>4235150 

len = 

Term 
Intr 
Init 

>4249393 

len = 

Term 
Init 

>4249393 

len = 

Init 
Intr 
Intr 
Intr 
Intr 



/110090 

778 nex = 

52159 52284 
52598 52936 

/30523 

537 nex = 

54031 54567 

/32316 

521 nex = 

56193 56713 

/111638 

1210 nex = 

71107 70335 
71256 71195 
71543 71346 

/32272 

1224 nex = 

71107 70354 
71256 71195 
71577 71346 

/25124 

1278 nex = 

71107 70354 
71256 71195 
71631 71346 

/6819 

1766 nex = 

43545 42141 
43906 43634 

/6017 

2858 nex = 

47446 47702 

47789 47851 

47928 48017 

48101 48160 

48246 48311
Intr 48386 48451

Intr 48684 48748 

Intr 48827 48919 

Intr 49182 49266 

Intr 49347 49390 

Intr 49467 49529 

Intr 49655 49708 

Intr 49773 49832 

Term 49904 50303 



Init 49619 49708 

Intr 49773 49832 

Term 49904 50282 

>4249393 /38690 

len = 2474 nex = 

Term 51247 50779 

Intr 51460 51331 

Intr 51783 51576 

Init 53252 52424 

>4249393 /34081 

len = 1122 nex = 

Init 65982 66175 

Intr 66322 66532 

Intr 66615 66784 

Term 66874 67103 

>4249393 /39971 

len = 655 nex = 

Sngl 72891 72237 

>4262221 /40457 

len = 532 nex = 

Sngl 10235 10766 

>4262221 /19163 

len = 1709 nex = 

Term 17710 16611 

Init 18319 18101 

>4262221 /17996 

len = 2650 nex = 



60 



Init 



24589 25919 



0 






Term 

>4262221 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4262221 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4262221 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4262221 

len = 

Sngl 

>4262221 

Sngl 

>4262221 

len = 

Term 
Intr 
Intr 
Init 



26031 27234 



1422 nex = 

42754 42345 

42956 42854 

43130 43051 

43426 43369 

43616 43519 

43766 43687 



/108290 



1635 



nex : 



42754 42345 

42956 42854 

43130 43051 

43426 43369 

43616 43519 

43869 43687 

/8949 

2493 nex = 

45313 44902 

45683 45592 

45851 45766 

46447 46219 

46760 46534 

47394 46849 

/3831 

1431 nex = 

59938 58508 

/120459 

6 70 nex = 

61325 60659 

/22418 

1779 nex = 

5241 4407 
5454 5345 
5819 5550 
6185 5932 



/37962 



len 

60 



1312 nex = 



3 






Init 
Intr 
Term 



73550 73783 
73889 74305 
74608 74861 



>4262221 
len = 



/2877 
2110 ne 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4263038 

len = 

Sngl 

>4263038 

len = 

Sngl 

>4263373 

len = 



75024 
75175 
75424 
75663 
75879 
76233 
76452 



74744 
75108 
75248 
75511 
75796 
76087 
76319 



795374 
430 nex = 
43068 43493 
73879 
642 nex =
43783 43142 
7255 
7618 nex = 



Term 
Intr 
Intr 
Init 

>4263373 

len = 



38456 
40079 
43641 
44004 



36387 
40029 
40959 
43817 



74059 
7616 



nex = 



Term 
Intr 
Intr 
Init 



38456 
40079 
43641 
44004 



36389 
40029 
40959 
43817 



>4263373 
len = 



742686 
1090 nex 



Term 
Init 



58099 57610 
58697 58409 



>4263373 
len = 



728019 
1179 nex 



Term 58099 57546 
Init 58724 58409 

60 






Init 44628 44729 



Intr 
Intr 
Intr 
Term 



44822 44985 

45066 45099 

45171 45264 

45352 45476 

742384 



Init 
Intr 
Intr 
Intr 
Term 



868 nex = 

44631 44729 

44822 44985 

45066 45099 

45171 45264 

45352 45498 



/34420 



1907 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4263586 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4263694 

len = 

Sngl 

>4263694 



54050 
54344 
54536 
54749 
54915 
55154 
55337 
55771 



2153 

55911 
56483 
56661 
56893 
57082 
57205 
57350 
57841 



53865 
54138 
54429 
54633 
54829 
54990 
55226 
55407 



56299 
56550 
55794 
56981 
57139 
57257 
57562 
58063 



/19875 
171 nex = 
10500 10333 
/31665 
733 nex =



60 



Term 
Intr 
Init 



10901 10692 
11155 11017 
11424 11303
>4263694
len = 



/40344 
701 nex 



Term 
Init 

>4263694 
len = 
Sngl 

>4263694 
len = 



14673 14535 
14888 14792 

/159279 

746 nex = 

16225 16970 

/32907 



790 



nex = 



Term 
Intr 
Init 



29067 28667 
29226 29161 
29452 29310 



1706 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



29067 
29225 
29522 
29704 
29841 
30029 
30183 



28646 
29161 
29310 
29617 
29804 
29932 
30118 



>4263694 
len = 



1351 



nex 



Term 
Intr 
Intr 
Init 

>4263694 

len = 

Init 
Intr 
Intr 
Intr 
Term 



43833 43628 

44147 44072 

44489 44259 

44978 44842 

/13834 

1330 nex = 



49040 
49251 
49548 
49937 
50123 



49096 
49313 
49848 
50002 
50363 



>4263694 
len = 



1270 nex = 



Init 
Intr 
Intr 



49251 
49548 
49937 
50123 



49313 
49848 
50002 
50363
>4263694

len = 

Init 
Intr 
Term 

>4263694 

len =

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4263694 
len = 
Sngl 

>4263694 

len = 

Init 
Term 

>4263694 

len = 

Init 
Term 

>4263753 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4263753 

len = 



49575 49848 
49937 50002 
50123 50328 

/15582 

2721 nex = 

52117 51580 

52340 52248 

52602 52434 

52976 52716 

53226 53078 

53488 53326 

53624 53560 

53954 53704 

54300 54063 

/21201 

707 nex =

57642 58348 

/35447 

804 nex = 

79508 79806 
79888 80311 

/37871 

460 nex =

79509 79806 
79888 79968 

/29201 

1898 nex = 

23901 23530 

24129 24008 

24602 24242 

25041 24923 

25424 25374 

/38824 

2017 nex = 



Term 23901 23444 
Intr 24129 24008 
Intr 24602 24242






Intr 
Init 

>4263753 

len = 

Term 
Intr 
Intr 
Init 

>4263753 

len =

Term 
Intr 
Init 

>4263753 

len = 

Term 
Intr 
Intr 
Intr 

Intr 
Intr 
Intr 
Intr 
Init 

>4263753 

len = 

Term 
Intr 
Init 

>4263753 

len = 

Sngl 

>4263753 

len = 

>4263753 

len = 

Sngl 



25041 
25447 



24923 
25374 



/32341 

1853 nex = 

25964 25590 

26181 26057 

26645 26279 

27442 27181 

/18342 

1999 nex = 

25964 25587 
26181 26057 
26645 26279 

/142593 

2369 nex = 



27931 
28203 
28363 
28872 
29202 
29421 
29591 
29825 
30052 



27684 
28139 
28298 
28453 
29080 
29303 
29498 
29715 
29909 



/114575 

2332 nex = 

40874 40258 
42140 40970 
42589 42333 

/104891 

109 0 nex = 

49712 49632 

734854 

771 nex = 

/7959 

1461 nex = 

49712 49320 



>4263753 

60 



/22013
Init
Intr 
Term 



59531 59715 
60399 60450 
60528 61003 



>4263753 

len = 

Init 
Term 

>4263753 
len = 
Sngl 

>4263753 
len = 



/121365 

1415 nex = 

59581 59715 
60399 60450 

/8599 

1015 nex = 

72614 72191 

/42931 

2493 nex = 



Term 
Intr 
Intr 
Init 

>4263753 

Term 
Intr 
Intr 
Init 

>4263762 

len = 



72614 72054 

73868 73162 

74336 74237 

74546 74430 

/33691 

2530 nex = 



72614 
73868 
74336 
74574 



72054 
73162 
74237 
74430 



724286 
2117 nex 



Init 
Intr 
Term 

>4263774 

len = 



71346 71674 
71760 71795 
73186 73462 

/9149 

1339 nex = 



Term 
Intr 
Init 



11294 10832 
11773 11641 
12170 11858 



>4263774 
len = 



78134 
1118 nex 



60 



Term 
Init 



959 793 
1290 1112
>4263774
len = 



Term 
Init 



>4263774 
len = 



959 618 
1290 1112 



Init 
Term 

>4263774 
len = 
Sngl 

>4263774 

len = 

Term 
Init 

>4263774 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4263774 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4263774 

len = 



22583 
23358 



571 



22724 
23637 



nex = 



29990 30560 

722447 

1408 nex = 

2804 2015 
3423 3330 



2445 

4021 
4472 
4643 
4808 
4958 
5677 
5890 
6073 
6264 



2290 

45101 
45981 
46189 
46402 
46569 
46747 



nex = 

4390 
4567 
4735 
4881 
5224 
5804 
5985 
6178 
6465 



nex = 

45783 
46103 
46282 
46484 
46658 
47387 



/2403 
1055 nex 



Term 
Intr 
Init 



62305 62089 
62745 62486 
63143 63001
>4309719

len = 

Term 
Init 

>4309719 

len = 

Init 
Term 

>4309719 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4309719 

len = 

Term 
Init 

>4309719 
len = 
Sngl 

>4309719 

len = 

Init 
Intr 
Term 

>4309719 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



738985 

748 nex = 

21312 20917 
21664 21527 

7692 

1100 nex = 

24887 24997 
25486 25986 

75695 

1349 nex = 

26228 25952 

26479 26399 

26755 26593 

26979 26869 

27300 27066 

7105626 

1111 nex = 

33999 33726 
34836 34438 

74845 

1063 nex = 

43384 44446 

72731 

971 nex = 

60881 61402 
61481 61699 
61778 61851 

732047 

1818 nex = 

65313 65489 

65653 65708 

65790 65882 

65980 66050 

66683 66787 

66874 67130 



>4309719 



78068 



len = 

60 



2380 nex =



Init 
Intr 
Intr 
Intr 
5 Intr 
Intr 
Intr 
Term 

10 >4309719 
len = 
Init 

1 5 Term 
>4309719 
len = 
Sngl 
>4309747 

2 5 len = 

Init 
Intr 
Intr 

3 0 Intr 

Term 

>4314354 

3 5 len = 

Init 
Term 

40 >4314354 
len = 
Term 

45 Intr 
Intr 
Intr 
Init 

50 >4314354 
len = 
Sngl 
>4314354 
len = 
60 Init 



55 



81136 81254 

82137 82165 

82264 82329 

82552 82659 

82754 82819 

82909 83029 

83121 83202 

83284 83515 

/150251 

550 nex = 

83121 83202 
83284 83538 

735743 
1049 nex = 
89585 90633 

/34060 

1132 nex = 

38591 38699 

38809 38862 

39074 39183 

39346 39430 

39528 39722 

/1650 

1096 nex = 

33624 33916 
34315 34719 

/19093 

1161 nex = 

36756 36705 

36980 36836 

37153 37045 

37293 37236 

37466 37365 

/107116 

379 nex =

41136 41514 

737274 

2230 nex = 

46941 47138
Intr
Intr 
Intr 
Intr 
Term 



47607 48124 

48222 48451 

48548 48689 

48776 48837 

48917 49168 



>4314354 
len = 
Sngl 

>4314374 
len = 



/4014 



Term 
Intr 
Init 



443 32 
853 747 
1041 1020 



>4314374 
len = 



738329 
2442 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4314374 

len = 



2646 
2782 
3021 
3181 
3316 
3473 
3616 
3771 



2694 
2852 
3085 
3225 
3396 
3540 
3664 
3925 



Term 
Init 



>4314374 
len = 



22049 21488 
22326 22150 



2177 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



42829 
43040 
43426 
43549 
43684 
43912 
44106 
44262 
44687 



42511 
42944 
43260 
43516 
43638 
43761 
44015 
44214 
44480 



>4314374 
len = 



/1126 
1671 nex 



Term 57444 57236 
Intr 58681 57651
Init
>4314374 
5 len = 

>4325340 
len = 

10 

Sngl 
>4325340 
15 len = 

Sngl 
>4325340 

20 

len = 
Sngl 

25 >4325352 
len =
Sngl 

30 

>4325365 

len = 

35 Term 
Intr
Intr 
Intr 
Intr 

4 0 Init 
>4325365 
len = 

45 

Init 
Term 



50 



>4325365 



len = 
Sngl 

55 >4325365 
len = 
Sngl 

60 



58906 58778 

726333 

1393 nex = 

/27144 

46 0 nex = 

950 630 

/19541 

67 0 nex = 

950 636 

/122423 

624 nex = 

30231 30854 

79742 

131 nex = 

51030 51160 

733985 

2158 nex = 

346 1 

977 867 

1268 1059 

1516 1383 

1864 1705 

2158 1967 

735963 

946 nex = 

65779 66009 
66230 66707 

73552 

1150 nex = 

93731 94876 

76659 

1125 nex = 

93738 94862
>4325365

len = 

Sngl 

>4335711 

len = 

Sngl 

>4335711 

len = 

>4335711 

len = 

Sngl 

>4335711 

len = 

Init 
Intr 
Term 

>4335711 

len = 

Init 
Term 

>4335711 

len = 

Init 
Intr 
Term 

>4335711 

len = 

Init 
Intr 
Term 

>4335711 

len = 

Sngl 

>4335744 



736687 
1990 nex = 
99721 100926 
/5 

1270 nex = 
115475 114208 
/9064 
650 nex = 
/7216 
1048 nex = 
35552 34505 
77792 

1193 nex = 

46238 46470 
46624 46739 
47125 47430 

7122361 

718 nex = 

46239 46470 
46624 46956 

77201 

1192 nex = 

46239 46470 
46624 46739 
47125 47430 

740916 

1194 nex = 

46239 46470 
46624 46739 
47125 47432 

79723 

2068 nex = 

82273 80206 

741585 






Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4335744 
len = 
Sngl 

>4335744 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4335744 

len = 

Init 
Intr 
Term 

>4337186 

len = 

Sngl 

>4337186 

len = 

Sngl 

>4337186 

len = 

Sngl 



10913 
11117 
11445 
11721 
12110 
12261 
12404 
12578 
13107 



10690 
11009 
11232 
11575 
11952 
12202 
12337 
12482 
12916 



735226 
900 nex = 
39634 40533 
/41376 
2132 nex = 



47695 
47825 
48035 
48223 
48485 
48707 
48859 
48974 



47718 
47884 
48129 
48395 
48597 
48752 
48899 
49259 



/1697 

1601 nex = 

58565 58779 
59480 59534 
59634 60165 

733524 

173 nex = 

2653 2825 

7141837 

392 nex =

28822 28431 

7110587 

720 nex = 

39509 39644 



>4337186 716295 

3190



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4337186 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4337186 
len = 
Sngl 

>4371278 

len = 

Init 
Intr 
Term 

>4371278 

len = 

Sngl 

>4371278 

len = 

Sngl 

>4371278 

len = 



44064 
44410 
44623 
44771 
45009 
45167 
45503 
45889 
46155 
46520 
46942 



44289 
44537 
44694 
44884 
45077 
45277 
45565 
46044 
46235 
46648 
47253 



/7536 
2080 ne 



62706 
62883 
63047 
63232 
63400 
63600 
63730 
63936 
64142 
64462 



62383 
62815 
62962 
63146 
63322 
63523 
63689 
63836 
64032 
64286 



/6917 
277 nex = 
766 1042 

/15139 

1126 nex = 

15864 16132 
16428 16560 
16673 16989 

723523 

679 nex =

18840 19518 

/108612 

457 nex = 

18849 19305 

/11998 

1361 nex = 



60 



Term 



20141 19887 



0
Intr
Init 


20371 
20758 
21247 


20231 
20463 
20880 


- 


0 
0 


5 


>4371278 /25220
len = 1413 nex = 4
Init 24105 24468 + 0
Intr 24590 24885 +
Intr 24977 25117 + 0
Term 25216 25517 + 0




>4371278 /38965
len = 1137 nex = 4
Init 24195 24468 + 0
Intr 24590 24885 + 0
Intr 24977 25117 + 0
Term 25216 25331 + 0




>4371278 /3571
len = 1941 nex = 5
Init 38901 39030 + 0
Intr 39208 39356 + 0
Intr 39462 39650 + 0
Intr 39739 39907 + 0
Term 40334 40841 + 0




>4371278 /23542
len = 381 nex = 1
Sngl 4113 4343 + 0


>4371278 /1517
len = 850 nex = 2
Init 43998 44170 + 0
Term 44506 44846 + 0

>4371278 /2454
len = 1371 nex = 3
Term 45469 45000 - 0
Intr 45776 45574 - 0
Init 46370 45982 - 0


>4371278 /38677
len = 1210 nex = 1
Sngl 46951 48155 + 0



>4376087 /92448
Init 100575 100763
Term 100977 101282



>4376087 /30438 
len = 2020 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



100575 
100977 
101420 
101697 
101967 
102338 



100763 
101158 
101606 
101858 
102269 
102594 



>4376087 /18909 
len = 2031 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



110792 
110969 
111157 
111325 
111459 
111712 
112024 
112222 
112536 



110506 
110872 
111078 
111254 
111418 
111551 
111796 
112153 
112404 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4376087 

len = 

Term 
Init 

>4376087 

len = 

Term 
Intr 
Init 

>4376087 

len = 



122365 122161 

122548 122446 

122806 122654 

122940 122913 

123065 123029 

123336 123143 

/18266 

1184 nex = 

124539 123889 
125072 124716 

/1794 

835 nex = 

12383 12097 
12576 12475 
12931 12658 

/11274 

1278 nex =
Init 130626 130868

Intr 130994 131032 

Intr 131111 131263 

Intr 131398 131473 

Term 131661 131903 



>4376087 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4376087 

len = 

Sngl 

>4376087

len = 

>4376087 

len = 

Sngl 

>4376087 

len = 

Sngl 

>4376087 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4388714 

len = 

Init 
Intr 
Intr 
Term 



/13475 

2498 nex =

174373 174669 

175057 175144 

175243 175307 

175412 175460 

175723 175775 

176218 176333 

176431 176488 

176591 176870 

/19707 

799 nex = 

26682 25892 

/27837 

1298 nex =

/39370 

1124 nex = 

27194 26071 

/16131 

691 nex = 

43223 42533 

/108940 

1463 



89305 
89463 
89742 
90200 
90417 



nex = 

88955 
89386 
89557 
90148 
90284 



/40273 

3039 nex = 

79641 79729 

79806 79946 

81346 81476 

81558 82484 
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>4388714 
len = 



739234 
1951 nex =



15 



Init 
Intr 
Term 

>4388714
len =
Sngl
>4406752
len =

Init
Intr
Intr
Intr
Term

>4406752
len =

Init
Term

>4406752
len =



79641 79729 
79806 79946 
81346 81472 

/103540 

477 nex =

82014 82484 

72021 

1990 nex = 

16077 16503 

16583 16721 

16810 16963 

17036 17159 

17250 17624 

720712 

1210 nex = 

42530 42826 
42920 43154 

7390 

627 nex = 



Init 
Term 



>4406752
len = 



42528 42826 
42920 43154 



718785 
1039 nex =



Term 
Init 



43763 43148 
44186 43973 



>4406752 
len = 



738210 
1270 nex = 



Term 
Init 



64451 63872 
65132 64859 



len = 
Sngl 



73964 
1154 nex = 
70202 69049 



>4406752



717089 
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2174 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4406776 

len = 

Init 
Intr 
Intr 
Term 

>4406776 

len = 

Sngl 

>4406776 

len = 

>4406776 

len = 

Sngl 

>4406790 

len = 

Sngl 

>4406790 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



77860 
78111 
78423 
78666 
78889 
79177 
79324 
79482 
79755 



78033 
78291 
78565 
78729 
78941 
79232 
79398 
79670 
80033 



736238 
2338 nex 



59059 
59414 
59810 
60442 



59158 
59731 
60366 
61396 



78529 
528 nex = 
60863 61390 
739285 
1992 nex = 
733274 
310 nex = 
61563 61868 
7110175 
473 nex = 
27736 27628 
714605 
2614 nex = 



72541 
72828 
73016 
73401 
73588 
73852 
74050 
74379 
74542 



72737 
72926 
73126 
73494 
73688 
73953 
74163 
74453 
74819 



>4406790 




74489 
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len = 
Sngl 
>4406805
len =

Term
Init

>4406805
len =

Term
Init

>4406805
len =
Sngl

>4406805
len =

Term
Init

>4406805
len =

Term
Intr
Intr
Init

>4406805
len =

Term
Intr
Intr
Init

>4406805
len =



Term 
Init 



>4406805 
len = 



396 nex = 

81642 81247 

/11178 

814 nex = 

29635 29022 
29835 29708 

/112970 

816 nex = 

29635 29024 
29839 29708 

/42057 

156 nex = 

33661 33506 

737664 

1390 nex = 

32476 32343 
33725 33309 

/40712 

1607 nex = 

34590 34119 

34839 34695 

35037 34925 

35725 35374 

735337 

1658 nex = 

2559 2284 

2866 2649 

3066 2950 

3279 3171 

718197 

992 nex = 

38065 37812 
38803 38299 

7105341 

1033 nex = 
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Sngl 

>4415905
len =

Init
Term

>4415905
len =

Term
Intr
Intr
Init

>4415928
len =
Sngl

>4415928
len =

Init
Term

>4415928
len =

Term
Intr
Init

>4415928
len =
Sngl

>4415928
len =
Sngl

>4417264
len =
Sngl

>4417264
len =



51271 50239 

738827 

1224 nex = 

11227 12103 
12183 12450 

/14564 

1454 nex = 

2076 1784 

2319 2171 

2509 2408 

2694 2619 

/11006 
799 nex =
15192 14394 

/10850 

819 nex = 

34311 34796 
34892 35129 

725575 

1297 nex = 

35365 35116 
35883 35801 
36412 36246 

77802 

550 nex = 

60672 61212 

733287 

394 nex =

60674 61067 

71841 

634 nex = 

21959 22592 

738497 

1733 nex = 
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Term 
Init 



>4417264 
len = 



50530 49193 
50925 50620 



2530 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



51442 
51661 
51937 
52307 
52643 
53125 
53657 



51135 
51539 
51791 
52197 
52499 
53032 
53503 



>4417264 
len = 



1577 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Init 



55897 
56131 
56288 
56525 
56707 
57202 



55626 
55980 
56208 
56379 
56608 
57038 



len =



/18922 
2470 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



82338 
82475 
82615 
82826 
83016 
83186 
83414 
83592 
83736 



82391 
82529 
82687 
82913 
83098 
83315 
83492 
83656 
84101 



2150 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



85855 
86197 
86405 
86591 
86902 
87066 
87232 
87439 
87622 
87762 



86098 
86292 
86479 
86659 
86978 
87154 
87361 
87529 
87687 
88004 



len = 




892 nex = 



2 
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Term 
Init 

>4432811 

len = 

Init 
Intr 

>4432829 

len = 

Sngl 

>4432829 

len = 

Sngl 

>4432829 

len = 

Term 
Init 

>4432829 

len = 

Term 
Init 

>4432829 
len = 
Sngl 

>4432829 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4432829 

len = 

Sngl 



88380 87971 
88839 88459 

79846 

1030 nex = 

12818 12984 
13233 13498 
13588 13843 

/119200 

689 nex = 

14377 13689 

/14334 

511 nex = 

21464 21974 

/14111 

2126 nex = 

37251 36808 
37474 37342 

/10433 

935 nex = 

50590 50253 
51187 51056 

/3673 

833 nex =

69750 70582 

/23166 

3474 



86292 
86521 
87019 
87222 
87948 
88192 
89394 



nex = 

85921 
86378 
86949 
87123 
87918 
88095 
89217 



797776 
790 nex = 
9122 8339 
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>4432847 
len = 

>4432847 

len = 

Init 
Term 

>4432847 

len = 

Sngl 

>4454004 



55664 
55970 



55782 
56339 



len 



2261 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



17173 
17285 
17408 
17581 
17828 
18104 
19121 



16861 
17262 
17382 
17514 
17657 
17916 
18868 



>4454004 
len = 



2301 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



17173 
17285 
17408 
17581 
17828 
18104 
18942 



16834 
17262 
17382 
17514 
17657 
17916 
18868 



>4454004 
len = 



2334 



nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



17173 
17285 
17408 
17581 
17828 
18104 
18942 



16914 
17262 
17382 
17514 
17657 
17916 



>4454004 
len = 



/11559 
1123 nex =



Init 2493 2663 
Intr 2815 2904
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Intr 
Term 



Term 
Init 

>4454004 

len = 

Term 
Intr 
Init 



Init
Intr
Intr
Term

>4454004
len =

Init
Intr
Term

>4454004
len =
Sngl

>4454004
len =

Init
Intr
Intr
Term

>4454004
len =
Sngl

>4454022
len =



2993 3115 
3310 3615 

736536 

1469 nex = 

31479 31022 
32490 31900 

729369 

1586 nex = 

34339 33967 
34718 34509 
35552 34914 

738603 

1935 nex = 

42043 42575 

43023 43266 

43356 43461 

43541 43977 

726380 

1279 nex = 

606 927 
1278 1392 
1530 1881 

7150586 

714 nex = 

56988 57701

760 

1550 nex = 

77637 77784 

78097 78344 

78424 78751 

78826 79186 

78252 
1648 nex = 
9384 9995 

723878 
1042 nex = 
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1369 

Term 11752 11125 - 0
Intr 11994 11828 - 0
Init 12166 12098 - 0

>4454022 /40423
len = 2478 nex = 10
Init 17803 17967 + 0
Intr 18546 18615 + 0
Intr 18803 18927 + 0
Intr 19076 19178 + 0
Intr 19253 19349 + 0
Intr 19432 19497 + 0
Intr 19591 19729 + 0
Intr 19823 19902 + 0
Intr 19987 20091 + 0
Term 20179 20280 + 0

>4454022 /6527
len = 1964 nex = 3
Term 33047 32545 - 0
Intr 33895 33766 - 0
Init 34508 34027 - 0

>4454022 /39185
len = 1852 nex = 5
Init 35769 36127 + 0
Intr 36534 36714 + 0
Intr 36797 36938 + 0
Intr 37035 37104 + 0
Term 37198 37620 + 0

>4454022 /14129
len = 1813 nex = 5
Init 35826 36127 + 0
Intr 36534 36714 + 0
Intr 36797 36938 + 0
Intr 37035 37104 + 0
Term 37198 37638 + 0

>4454022 /25397
len = 1077 nex = 4
Init 36551 36714 + 0
Intr 36797 36938 + 0
Intr 37035 37104 + 0
Term 37198 37627 + 0

>4454022 /37278



len = 




9 
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Term 


42843 


42630 


-


Intr 


43042 


42943 


- 


Intr 


43257 


43130 




Intr 


43493 


43326 




Intr 


43710 


43595 




Intr 


43953 


43801 




Intr 


44188 


44045 




Intr 


44554 


44268 




Init 


45248 


44799 





>4454022 
len = 
Sngl 

>4454022 

len = 

Term 
Intr 
Init 

>4454022 
len = 
Sngl 

>4454022 

len = 

Term 
Intr 
Intr 
Init 

>4454022 

len = 

Term 
Init 

>4454022 

len = 

Term 
Init 

>4454022 

len = 



737786 

310 nex = 

50741 50436 

/108949 

1140 nex = 

50767 50437 
50907 50807 
51576 51263 

73802 

233 nex = 

55269 55037 

741988 

4725 nex = 

50767 50548 

50907 50807 

54477 51263 

55272 55024 

742101 

1286 nex = 



54477 
55272 



53987 
55024 



7115402 
974 nex 



54477 54301 
55272 55024 



7118240 
939 nex 



Term 54477 54334 
Init 55272 55024 

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>4454022 



/32540 
938 nex 



Term 
Init 

>4454022 

len = 

Term 
Intr 
Intr 
Init 

>4454022 
len = 
Sngl 

>4454022 

len = 

Init 
Intr 
Term 

>4454022 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4454022 

len = 

Init 
Term 

>4454447 

len = 

Init 
Intr 
Term 

>4454447 

len = 



54477 54335 
55272 55024 



/92204 
4728 nex 



50767 
50907 
54477 
55275 



50548 
50807 
51263 
55024 



/660 

730 nex = 

57533 56808 

72946 

1489 nex = 

67882 68445 
68793 68893 
69004 69370 



/21949 
2013 



80581 
81338 
81487 
81634 
81901 
82128 
82416 



nex = 

80861 
81412 
81543 
81800 
82035 
82330 
82593 



/122649 

1106 nex = 

99722 99910 
100566 100827 

/1898 

855 nex = 

105950 106053 
106164 106220 
106331 106804 

/528 

1665 nex = 
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Term 
Init
>4454447 



Term 
Intr 
Intr 
Intr 
Init 



107093 106729 
108393 107558 
/34908 



2650 



nex 



109037 108541 

109327 109116 

109565 109443 

109985 109788 

111189 110312 



>4454447 /34827 
len = 1645 nex 



Init 
Intr 
Term 



118166 118441 
118556 118784 
119096 119810 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4454447 

len = 

Sngl 

>4454447 

len = 

Sngl 

>4454447 

len = 

Init 
Intr 
Intr 
Intr 
Term 



14000 
14573 
14892 
15056 
15442 
15656 
15876 
16151 
16297 
16489 
17125 



nex = 

13685 
14085 
14646 
14980 
15368 
15555 
15758 
16028 
16247 
16391 
16761 



/16840 
730 nex = 
26481 25758 
/15103 
670 nex =
28997 29662 
727566 
1108 



29987 
30173 
30358 
30525 
30777 



nex = 

30074 
30269 
30436 
30606 
31079 
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len • 



3319 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4454447 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4454447 

len = 

Sngl 

>4454447 

len = 

Sngl 

>4454447 

Sngl 

>4454447 

len = 

Sngl 

>4454447 

len = 

Init 
Term 



38766 39223 

39301 39429 

39523 39608 

39702 39828 

39928 39976 

40187 40269 

40380 40447 

40895 40969 

41157 41244 

41368 41484 

41575 42084 

734875 

1750 nex = 

42627 42287 

42873 42758 

43258 43042 

43617 43350 

44030 43729 

/2512 

682 nex = 

48986 49667 

/41949 

430 nex =

48989 49412 

/7215 

566 nex = 

48989 49554 

/120852 

717 nex = 

48989 49705 

/10398 

1539 nex = 

61887 62281 
62817 63409 



len = 




1055 nex = 



2 
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Term 
Init 



>4454447 
len = 



66889 66819 
67328 67140 



/14736 
970 nex =



Term 
Init 



66889 66819 
67371 67140 



>4454447 
len = 
Sngl 
>4454585 



/33816 
389 nex = 
98602 98990 
/10680 



len =



1458 



nex =



Init 
Intr 
Intr 
Term 



38285 38437 

38805 38874 

39093 39133 

39382 39742 



/105376 



Sngl 
>4454587 



68863 69530 
/1268 



len 



2668 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



73570 
73845 
74008 
74245 
74415 
74602 
74986 
75192 
75362 
75529 
75804 



73137 
73789 
73940 
74177 
74347 
74513 
74904 
75112 
75280 
75438 
75648 



>4454587 
len = 



1757 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



76531 
76696 
76825 
76986 
77147 
77788 



76459 
76662 
76768 
76926 
77068 
77455 



>4454587 




/1045 
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2613 nex ■ 



Init
Intr 
Intr 
Intr 
Intr 
Term 



92251 
92686 
93334 
93678 
94015 
94517 



92370 
92822 
93583 
93925 
94422 
94863 



>4455168 
len = 



Init 
Term 



39463 39668 
39795 40939 



>4455168 
len = 



Init 
Term 



39464 39668 
40110 40951 



>4455168 
len = 



2265 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



41426 
41759 
41920 
42088 
42279 
42473 
42767 
43388 



41124 
41631 
41879 
42029 
42166 
42383 
42582 
43192 



>4455168 
len = 



Init 
Term 



48010 48378 
49184 49500 



>4455168 
len = 



1004 nex 



Term 
Intr 
Intr 
Intr 
Init 



65924 65732 

66098 66012 

66357 66172 

66540 66446 

66735 66635 



>4455168 
len = 



/6091 
1847 nex = 



Init 71711 71850 
Intr 72177 72209 
Intr 72291 72573
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Intr 
Term 



72646 73126 
73255 73557 



>4455168 
len = 
Sngl 

>4455189 
len = 



/37329 
326 nex = 
77591 77916 
/34595 
1423 nex = 



Init 
Term 



16690 17210 
17612 17935 



>4455189 
len = 



/31962 
1362 nex 



Init 
Intr 
Term 

>4455189 

len = 



16690 16812 
17163 17210 
17612 17944 

/111016 

1319 nex = 



Init 
Intr 
Term 

>4455189 

len = 

Init 
Intr 
Intr 
Intr 
Term 



16607 16812 

17163 17210 

17612 17925 

/5367 

1551 nex = 

184 470 

860 924 

1005 1077 

1179 1232 

1353 1734 



>4455189 
len = 



735284 
1830 nex =



Term 
Intr 
Intr 
Intr 
Init 



33174 32856 

33398 33270 

33645 33478 

34173 33730 

34685 34427 



>4455189 
len = 



72574 
1558 nex =



Init 
Term 



35777 36240 
36966 37334 



>4455189



733007 
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2 3 92 nex 

/29414 
1398 nex 



Init 
Intr 

Term



45338 45550 
46160 46309 
46391 46735 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



1606 nex =

51378 51155 

51592 51511 

51844 51711 

52006 51919 

52157 52111 

52760 52482 

/93281 



len =



1559 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



51378 
51592 
51844 
52006 
52157 
52760 



51202 
51511 
51711 
51919 
52111 
52482 



2816 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4455229 

len = 



1117 1501 

1589 1775 

1866 2116 

2667 2781 

2897 2989 

3429 3572 

3664 3932 

/25830 

2596 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



16057 
16817 
16975 
17248 
17421 
17710 
17847 
18037 
18220 
18417 



16292 
16868 
17170 
17327 
17537 
17760 
17913 
18140 
18324 
18652 
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>4455229 

len = 

Init 
Intr 
Intr 
Term 



1244 nex = 

20496 20665 

20786 20812 

20941 21247 

21515 21739 



>4455229 
len = 
Sngl 

>4455229 
len = 



/111178 



42529 42245 
/31461 



Term 
Intr 
Init 



42587 42245 
43426 43256 
43908 43679 



>4455229 
len = 



1846 



Term 
Intr 
Intr 
Intr 
Init 

>4455229 

len = 



44919 44554 

45357 44992 

46012 45582 

46216 46084 

46399 46295 

732974 

2037 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4455229 

len = 



47455 46948 

47942 47823 

48307 48161 

48622 48409 

48744 48708 

48984 48865 

/16463 



Init 
Intr 
Intr 
Term 

>4455262 

len = 



57208 57329 

57866 58020 

58096 58362 

58556 59359 

/27914 

2620 nex = 



Init 101628 101891 
Intr 102198 102288 
Intr 102390 102442



0 
0 
0 
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Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4455262 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4455262 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4455262 

len = 

Sngl 

>4455262 

len = 

Sngl 

>4455262 

len = 

Sngl 

>4455262 

len = 

Sngl 

>4455262 

Sngl 



102552 102644 

102736 102807 

103074 103146 

103266 103311 

103418 103481 

103589 103642 

103729 103806 

103880 104247 

/4469 

1731 nex = 

104601 105127 

105304 105424 

105524 105750 

105860 106150 

106238 106331 

76952 

2335 nex = 

105304 105424 

105524 105750 

105860 106150 

106238 106443 

106521 106935 

/17600 

670 nex = 

23292 23961 

/107817 

299 nex = 

23641 23939 

/11205 

595 nex = 

76742 77336 

/33780 

670 nex =

77699 78366 

/11383 

110 nex = 

92430 92539 



>4455262



/18558 
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len = 
Sngl 
>4455290 
len = 



98939 99502 



Init 
Intr 
Term 



13247 13653 
14262 14502 
14622 14768 



len 



1930 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



34312 
34498 
34712 
35303 
35416 
35693 
35927 



34003 
34422 
34600 
35236 
35353 
35629 
35782 



>4455290 
len = 



2801 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4455290 

len = 

Term 
Intr 
Intr 
Intr 
Init 



37303 37639 

37723 37815 

38133 38237 

38889 39041 

39597 39695 

39798 39926 

76626 

1582 nex = 

4300 3883 

4575 4506 

4877 4654 

5062 4986 

5464 5337 



>4455290 
len = 



738327 



1909 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



4300 3745 

4575 4506 

4732 4654 

4877 4823 

5062 4986 

5653 5337 



>4455290 
len = 



78965 
509 nex 



Reference No. 2750-942P 



Sngl 
>4455290 
len = 



2272 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



69364 
69544 
69755 
69969 
70213 
70386 
70653 
71162 



68891 
69452 
69618 
69842 
70053 
70296 
70597 
70752 



>4455290 
len = 



/7191 
1255 nex 



Init 
Intr 
Intr 
Term 



88336 
88555 
88706 
88909 



88457 
88623 
88812 
89229 



/16314 
1256 nex 



Init 
Intr 

Intr 
Term 

>4455290 



88336 88457 

88555 88623 

88706 88812 

88909 89230 

/124124 

464 nex =



Init 
Term 

>4455321 
len = 
Sngl 

>4455321 
len = 



88791 88812 
88909 89254 



722274 
326 nex =



2293 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



46985 
47138 
47302 
47493 
47690 
48106 
48909 



46617 
47080 
47219 
47383 
47591 
47773 
48641 



>4455321



/31030 
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1700 nex ^ 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



53798 
54066 
54274 
54493 
54739 
54988 



53289 
53875 
54158 
54413 
54643 
54836 



>4455321 
len = 



/32761 
2005 nex =



Init 
Intr 
Intr 
Intr 
Term 

>4455339 

len = 



74537 74904 

75417 75734 

75871 75966 

76072 76139 

76237 76541 

/15213 



Term 
Init 



>4455339 



56170 55714 
57669 56842 



/36461 
2000 nex 



Term 
Intr 
Intr 
Intr 
Init 



63860 63223 

64081 63950 

64602 64187 

64869 64736 

65222 64975 



>4455339 



/33995 



1663 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



71457 
71779 
71985 
72096 
72362 
72557 
72814 



71689 
71876 
72015 
72195 
72432 
72700 
73119 



len =



/41557 
2280 nex = 



Init 
Intr 
Intr 
Term 



1059 1312 

2117 2572 

2790 2930 

3044 3338 



>4455348 




732397 
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Sngl 
>4455348
len =

Term
Intr
Intr
Intr
Init

>4455348
len =

Term
Intr
Intr
Intr
Init

>4455348
len =

Term
Init

>4455348
len =

Term
Init

>4455348
len =
Sngl

>4455348
len =

Term
Init

>4467094
len =

Term
Init

>4467094



919 nex = 

35039 34121 

/31814 

1740 nex = 

46233 45987 

46420 46319 

46639 46514 

46883 46799 

47726 47605 

/11590 

1462 nex = 

48273 48030 

48468 48348 

48653 48558 

48972 48817 

49491 49206 

/18870 

1078 nex = 

53197 52635 
53712 53462 

/39831 

1473 nex = 

53197 52607 
54079 53462 

77873 

550 nex = 

70379 70404 

/42155 

1694 nex = 

75967 75455 
77148 76451 

/32072 

1215 nex = 

111621 110777 
111991 111698 

737966 



1383 



0 



5 

0 
0 
0 
0 
0 



5 

0 
0 
0 
0 
0 



2 

0 
0 



2 

0 
0 



+ 0 



2 

0 
0 



2 

0 
0 
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len = 



Init 
Intr 
Intr 
Term 



113206 113830 

114252 114389 

114536 114760 

114833 115077 



/39781 



Init 
Intr 
Term 



115216 115464 
116303 116503 
116590 116779 



>4467094 



1546 nex = 



Sngl 123453 124998 



Term 
Init 



18966 18835 
19549 19051 



len 



3020 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



20057 
20251 
20444 
20602 
20750 
20936 
21221 
21392 
21572 
21781 
21937 
22163 
22345 
22567 
22820 



19801 
20141 
20333 
20525 
20685 
20832 
21024 
21300 
21480 
21657 
21861 
22042 
22259 
22517 
22674 



>4467094 
len = 



/27692 
678 nex 



Term 
Init 



57759 57398 
58075 57842 



>4467094 
len = 



742757 
1316 nex = 
61776 61514 
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Init 

>4467094 

len = 

Sngl 

>4467094 

len = 

Term 
Intr 
Intr 
Init 

>4467094 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4467094 

len = 

Sngl 

>4467094 

len = 

Sngl 

>4467131 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4467131 

len = 

>4467131 



62829 62037 

735382 

654 nex = 

70975 71628 

/18344 

1002 nex = 

71933 71728 

72070 72017 

72246 72141 

72729 72335 

721 

1308 nex = 

75122 74915 

75276 75205 

75455 75351 

75807 75617 

76222 76054 

77275 

677 nex =

89301 88625 

736035 

1604 nex =

90226 88623 

714711 

954 nex = 

16249 16304 

16399 16481 

16617 16701 

16800 16881 

16973 17202 

795661 
2264 nex = 

734210 
1513 nex = 



Term 41365 41312 
Intr 41555 41457 
Intr 41673 41645
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2192 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



41365 
41555 
41673 
41919 
42078 
42823 
43052 
43371 



41180 
41457 
41645 
41805 
42013 
42739 
42917 
43235 



>4467131 
len = 



Init 
Term 



>4467131 
len = 



44334 44556 
45193 45766 



Init 
Term 



4668 5085 
5231 6052 



2370 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



67293 
67447 
67646 
67826 
68190 
68614 
68808 
69041 
69249 
69423 



67054 
67379 
67546 
67730 
67911 
68402 
68694 
68936 
69136 
69331 



>4467131 



/33640 
2504 nex 



Term 
Intr 
Init 

>4467131 

len = 

Init 
Intr 
Intr 
Intr 



70027 69544 
70463 70118 
72047 70911 



2290 



nex = 



73093 73341 

73431 73471 

73571 73730 

73817 73992 
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Intr 
Intr 
Term 



74069 74173 
74253 74399 
74474 75115 



>4467131 
len = 



Term 
Intr 
Init 



76622 76090 
77665 76996 
77983 77792 



>4467131 
len = 



Term 
Intr 
Init 



79829 79698 
80958 80259 
81238 81056 



>4468103 
len = 



1595 



nex 



Term 
Intr 
Intr 
Init 

>4468103 

len = 



16660 16132 

17189 16761 

17559 17306 

17726 17639 

/11210 



Term 
Init 



>4468103 
len = 



42434 41751 
43937 42613 



/36616 
2338 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4468103 

len =



51987 
52425 
52655 
52938 
53108 
53260 
53449 
53625 
54010 



51673 
52370 
52555 
52846 
53030 
53199 
53356 
53537 
53706 



/8606 
2269 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 



60529 
60674 
60854 
61078 
61243 
61453 



60215 
60616 
60790 
60942 
61191 
61376 
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25 



50 



Intr 
Init 



len =

Init
Intr
Intr
Intr
Intr
Intr
Term

>4468801
len =
Sngl

>4468801
len =

Init
Term

Init
Term

>4468801
len =

Term
Init

>4468801
len =
Sngl

>4468801
len =

Init
Intr
Intr
Term



61736 61670 
62483 62229 

/943 

2505 nex = 

73001 73391 

73776 73852 

74018 74152 

74290 74344 

74437 74524 

74795 74893 

74995 75505 

/41783 
574 nex =
1419 846 

/40095 

2263 nex = 

35213 36127 
36724 37475 

733365 

1750 nex = 

48345 49288 
49374 50086 

/17990 

817 nex = 

55584 55278 
55732 55672 

/121376 

633 nex = 

80126 79494 

72583 

1516 nex = 

82824 83081 

83164 83364 

83438 83603 

83816 84339 



>4468801 



/33704 



len =



1125 nex = 



2 
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Term 
Init 



>4468801
len =

Init
Intr
Intr
Intr
Intr
Intr
Term

>4468976
len =

Init
Intr
Term

>4468976
len =
Sngl

>4468976
len =
Sngl

>4469002
len =
Sngl

>4469002
len =

Term
Init

>4469002
len =

Term
Init



>4469002 
len = 



91349 90695 
91819 91488 

723732 

2181 nex = 

94811 95064 

95159 95396 

95517 95707 

95803 96038 

96147 96390 

96504 96611 

96769 96991 

/92021 

1690 nex = 

40307 40379 
40472 40537 
40686 40979 

/41011 

640 nex = 

65865 66504 

/207350 

615 nex = 

88477 89091 

739733 

1604 nex = 

11302 9699 

77756 

731 nex = 

15799 15454 
16184 15960 

733380 

1150 nex = 

1303 530 
1678 1406 

720829 

3120 nex = 
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Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4469002 

len = 

Init 
Term 

>4469002 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4469002 

Term 
Intr 
Intr 
Init 

>4469002 

len = 

Sngl 

>4469002 

len = 

Sngl 

>4469002 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 



17041 
17196 
17550 
17871 
18146 
18282 
18836 
19200 
19691 



16572 
17137 
17459 
17773 
18054 
18225 
18687 
18957 
19451 



20958 21120 
21217 21658 



2411 nex = 

32961 32391 

33385 33177 

33762 33645 

34243 33981 

34801 34389 

/115976 

1533 nex = 

3286 2986 

3441 3370 

3876 3740 

4518 3982 

/8827 

490 nex =

61418 61899 

/12935 

473 nex =

62817 63289 

/14002 

1424 nex = 



76434 
76611 
76843 
77038 
77166 
77596 



76173 
76509 
76705 
76946 
77120 
77275 



>4469002



/24162 
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Init 
Intr 
Term 



79053 79213 
79319 79691 
79762 80345 



>4469002 
len = 



Term 
Init 



>4490291 
len = 



/107201 
1003 nex = 



84417 83587 
84589 84496 



Init 
Term 



30362 30506 
30547 30757 



>4490291 
len = 
Sngl 

>4490291 
len = 



30367 31044 
/107617 
730 nex = 



Init 
Intr 
Term 

>4490291 

len = 



35200 35539 
35630 35718 
35791 35920 



1770 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2571 
2837 
3098 
3230 
3408 
3574 
3724 
3994 



2225 
2755 
3032 
3180 
3296 
3461 
3668 
3815 



>4490291 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



1786 nex = 

2571 2225 

2837 2755 

3098 3032 

3230 3180 

3574 3461 

3724 3668 

3995 3821 
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>4490291 
len = 



1453 



Init
Intr 
Intr 
Term 

>4490291



48142 48411 

48498 48686 

48784 48946 

49035 49594 

/19165 



len 



1810 nex =



Term 

Intr
Intr 
Intr 
Init 

>4490291



50287 49800 

50763 50369 

51202 50861 

51411 51280 

51600 51524 

/40596 



2940 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



69161 
69309 
69681 
69887 
70245 
70842 
71090 
71267 
71890 



68951 
69244 
69649 
69828 
70184 
70771 
71045 
71170 
71358 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



72209 
72366 
72656 
72877 
73193 
73516 
73720 
74129 



71984 
72294 
72456 
72736 
72966 
73293 
73603 
73817 



len 



737432 
2266 nex =



Init 
Intr 
Intr 
Intr 
Term 



79757 
80346 
80743 
81256 
81593 



80261 
80623 
81019 
81501 
82022 



len = 




221 nex = 



1 
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Sngl 

>4490324 

len = 

Sngl 

>4490324 

len = 

Sngl 

>4490324 

len = 

Sngl 

>4490324 

len = 

Sngl 

>4490324 

len = 

Term 
Init 

>4490324 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4490324 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4490324 

len = 

Sngl 



18671 18451 

/150250 

264 nex = 

18806 18543 

/94851 

264 nex = 

30283 30027 

/41363 

709 nex = 

30771 30063 

/119544 

454 nex = 

43886 43433 

/2315 

1640 nex =

44776 43494 
45133 44870 

/21759 

1815 nex = 

57939 58220 

58586 58785 

58863 59003 

59082 59162 

59290 59414 

59498 59753 

737985 

1820 nex = 

57939 58220 

58586 58785 

58863 59003 

59082 59162 

59290 59414 

59498 59758 

/17181 

812 nex = 

61230 61139 
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>4490324 
len =
Sngl

>4490324
len =
Sngl

>4490324
len =



/10140 
550 nex = 
72094 71548 
/14973 
508 nex = 
72740 73247 
/117474 



1555 



nex =



Term
Intr
Intr
Intr
Init



82576 82369 

82830 82661 

83046 82926 

83406 83125 

83798 83526 



>4490701 



len =



/6867 
1007 nex =



Term
Intr
Intr
Init



14504 14298 

14711 14599 

14909 14780 

15294 15202 



>4490701 
len =



74374 



Term
Intr
Intr
Init



14504 14319 

14711 14599 

14909 14780 

15294 15202 



>4490701
len =

Term
Intr
Intr
Intr
Init

>4490701
len =

Term
Intr
Intr
Intr
Intr



1297 



nex 



14504 14300 

14711 14599 

14909 14780 

15303 15202 

15596 15529 



739825 



2359 



nex 



22765 22319 

22967 22931 

23112 23056 

23485 23327 

23674 23573 
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Intr 
Intr 
Init 



23973 23864 
24351 24252 
24677 24452 



2350 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



22765 
22967 
23112 
23485 
23674 
23973 
24351 



22390 
22931 
23056 
23327 
23573 
23864 
24252 



1848 nex 



Term 
Intr 
Intr 
Intr 
Init 



25202 24945 

25336 25283 

25665 25461 

25913 25833 

26792 26587 



/8812 



1259 



nex 



Term 
Intr 
Intr 
Intr 
Init 



101847 101489 

102100 101932 

102370 102201 

102632 102454 

102744 102713 



735272 



3745 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



101847 
102100 
102370 
102632 
102778 
103035 
103512 
103702 
103878 
104125 
104415 
105069 



101520 
101932 
102201 
102454 
102713 
102852 
103214 
103602 
103794 
103972 
104219 
104785 



Init 
Intr 
Term 



66866 67420 
67516 67595 
67696 68334 
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len = 1930 nex = 

Term 74823 74205 

Intr 75637 75441 

Intr 75819 75719 

Init 76126 75910 

90717 /8286 



1301 



nex 



Term
Intr
Intr
Init

>4490734



86445 86257 

86683 86641 

87034 86928 

87557 87378 

/16697 



Term 99947 99849 

Intr 100104 100038 

Intr 100447 100205 

Intr 100749 100543 

Init 101411 100828 



>4490734 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4490734 
len = 
Sngl 

>4490734 
len = 



1796 nex =

37430 37611 

37696 37769 

38182 38305 

38387 38698 

38786 39225 

/20457 

761 nex = 

42123 42883 

/123060 

2073 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



46245 45961 

46442 46334 

46578 46536 

46988 46695 

47215 47145 

47426 47318 

48033 47855 



/37341 



len =



1908 nex =



10 
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Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4490734 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4490734 

len = 

Init 
Intr 
Intr 
Term 

>4490734 

len = 

Init 
Term 

>4490734 

Sngl 

>4490734 

len = 

Term 
Intr 
Init 

>4490734 

len = 



74312 
74472 
74624 
74765 
74914 
75047 
75206 
75344 
75462 
76069 



74162 
74426 
74565 
74706 
74859 
75014 
75132 
75294 
75418 
75542 



/24741 

2473 nex = 

77167 77764 

77835 78112 

78224 78500 

78836 79081 

79167 79639 

/19562 

1739 nex = 

77929 78112 

78224 78500 

78836 79081 

79167 79667 

/8545 

1258 nex = 

80276 80540 
80921 81533 

/121755 

936 nex = 

91697 90762 

79448 

1168 nex = 

93687 93247 
93849 93773 
94003 93932 

74532 

1270 nex = 






Term 
Intr 
Init 



93687 93204 
93849 93773 
94003 93932 
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1275 



nex 



Term 
Intr 
Intr 
Init 



93687 93284 

93849 93773 

94003 93932 

94558 94467 



/2670 



1412 nex = 



Term 
Intr 
Intr 
Init 

>4490734 

len = 



95295 94885 

95484 95394 

95663 95587 

95822 95751 

/8313 

741 nex = 



Sngl 96873 97152 
>4490734 722456 



Term 
Init 



>4510323 
len = 



97920 97406 
98207 98150 



/29630 
1699 nex



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



116587 
116800 
117017 
117204 
117393 
117588 
117770 
118019 



116710 
116892 
117121 
117303 
117484 
117668 
117838 
118285 



1650 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



116589 
116800 
117017 
117204 
117393 
117588 
117770 
118019 



116710 
116892 
117121 
117303 
117484 
117668 
117838 
118238 



>4510323 
len = 



77542 
1518 nex =
Term
Intr 
Intr 
Init 



20387 20042 

20565 20495 

20825 20709 

21559 20906 



>4510323 
len = 

>4510323 
len = 
Sngl 

>4510323 
len = 



/36518 
2110 nex = 
/16583 
370 nex = 
53962 53601 
/11468 
1887 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4510323 

len = 



56611 56922 

57316 57415 

57509 57576 

57795 57847 

57934 57982 

58057 58104 

58233 58497 

728726 

1150 nex = 



Term 
Intr 
Init 



64164 63855 
64685 64611 
65000 64851 



>4510323 
len = 



1302 nex =



Init 
Intr 
Intr 
Intr 
Term 

>4510323 

len = 



84573 84896 

84980 85098 

85207 85395 

85480 85531 

85613 85874 

/21074 

1418 nex = 



Term 
Intr 
Init 



85969 85508 
86191 86055 
86925 86535 



>4510323 
len = 



/13256 
1850 nex = 



Term 94960 94737 
Intr 95223 95079
1400







Init 96586 95450 - 0

>4510338 /96414
len = 1937 nex =
28550 28235
Intr 29151 29012 - 0
30171 29548

>4510338 724255
len = 971 nex = 1
Sngl 34528 33558 - 0

>4510338 /2576
len = 2237 nex = 9
Init 36933 37151 + 0
Intr 37427 37542 + 0
Intr 37638 37697 + 0
Intr 37834 38016 + 0
Intr 38097 38201 0
Intr 38291 38387 + 0
Intr 38479 38586 + 0
Intr 38668 38803 + 0
Term 38888 39169 + 0

>4510338 /156731
len = 1390 nex = 2
Init 44273 44363 + 0
Term 45034 45662 0

>4510338 /114613
len = 1210 nex = 0

>4510338 /5167
len = 2170 nex = 5
Init 52854 52965 0
Intr 53527 53789 0
Intr 53878 54089 + 0
Intr 54172 54439 + 0
Term 54516 55019 + 0

>4510338 /3668
len = 1690 nex = 2
Init 56164 56327 + 0
Term 57130 57851 + 0

>4510338 /2813
1832 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4510338 



68165 
68439 
68750 
68919 
69117 
69373 
69637 



68340 
68646 
68835 
69020 
69250 
69421 
69988 



/12251 



len = 1059 nex = 
Sngl 71208 70150 
>4510360 /39035 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2077 

101338 
101452 
101719 
102088 
102369 
102562 
102944 
103096 



nex = 

101020 
101406 
101533 
101798 
102175 
102461 
102645 
103017 



len = 1536 nex =



Term 102088 101812 

Intr 102369 102175 

Intr 102562 102461 

Intr 102944 102645 

Init 103095 103017 



>4510360
Term

Intr
Intr
Intr
Intr
Intr

Init



/21707 



2395 

101338 
101452 
101719 
102088 
102369 
102562 
102944 



nex = 

101025 
101406 
101533 
101798 
102175 
102461 
102645 



>4510360 728577 

len = 1009 nex = 

Term 103942 103748 

Intr 104361 104019

Init 104756 104575



>4510360



722535
Term
Intr 
Init 

>4510360 

len = 

Term 
Intr 
Init 

>4510360 

len = 

Init 
Intr 
Intr 
Term 

>4510360 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4510360 

len = 

Sngl 

>4510360 

Sngl 
>4510360 
len = 

Sngl 
>4510392 



105113 104790 
105546 105210 
106116 105944 

/39503 

1090 nex = 

107026 106783 
107457 107118 
107872 107683 

/29951 

1349 nex = 



12868 
13411 
13677 
13876 



13060 
13550 
13802 
14216 



/13310 
2350 



31440 
31864 
32001 
32534 
32698 
32865 
33109 
33299 



nex = 

31565 
31919 
32107 
32590 
32761 
32999 
33192 
33781 



/26418 
874 nex = 
46437 45564 
/19481 
806 nex = 
49946 49141 
/91908 
732 nex = 
5525 6256 

/16423 
1588 nex = 






Term 



35088 34032 



0
Intr
Intr 
Intr 
Intr 
Term 



46949 
47177 
47334 
47531 
47737 



47090 
47246 
47399 
47653 
48053 



>4512646 
len = 



Term 
Init 



48617 48420 
50051 49809 



>4512646 
len = 



1430 nex =



Term 
Init 



49533 49412 
50053 49809 



>4512646 
len = 



Term 
Intr 
Init 



48617 48428 
49533 49412 
50057 49809 



>4512646 
len = 



Term 
Intr 
Init 

>4512646 



48617 48425 
49533 49412 
50093 49809 



2050 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4512656 

len = 

Term 
Intr 
Intr 
Init 



50564 
50835 
51046 
51222 
51577 
51893 
52086 
52376 



50329 
50657 
50938 
51137 
51518 
51678 
51973 
52179 



/41712 

1391 nex = 

105749 105453 

105920 105838 

106277 106194 

106843 106633 



>4512656 

/125409
Sngl

>4512656 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4512656 
len = 
Sngl 

>4512656 
len = 



971 nex = 

14193 15163 

/35051 

2301 nex = 

23482 23178 

23695 23596 

24031 23903 

24198 24130 

24399 24304 

24814 24499 

25478 25178 

/102453 

581 nex = 

44496 45076 

742863 

1078 nex = 



Term 
Init 

>4512656 

len = 

Init 
Intr 
Term 

>4512656 

len = 



47194 46904 
47981 47309 

/41214 

1304 nex = 

54263 54446 
54531 54981 
55077 55566 

/41345 

1301 nex = 



Init 
Term 



>4512656 
len = 



54266 54981 
55077 55566 



/99303 
1363 nex =



Init 
Intr 
Term 

>4512656 

len = 

Sngl 



54266 54446 
54876 54981 
55077 55628 

/156913 

550 nex = 

54268 54446 



>4512656 

/157512
len =

Sngl 

>4512656 

len = 

Term 
Intr 
Init 



len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4512656 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4512656 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4512656 

len = 

Sngl 

>4512656 



55360 55524 
/35604 



58659 58568 
58861 58780 
61159 60918 

/17416 

3176 nex = 



75697 
76075 
76358 
76651 
76872 
77055 
77316 
77507 
77677 
77840 
78076 
78255 
78474 



75985 
76255 
76443 
76755 
76934 
77164 
77435 
77582 
77750 
77960 
78173 
78378 
78872 



/21855 
1706 nex =



81740 
82425 
82665 
82880 
83024 



82100 
82580 
82797 
82919 
83445 



1721 nex = 

81740 82100 

82425 82580 

82665 82797 

82880 82919 

83024 83460 

/26194 

610 nex = 

84529 85135 

/17723 



len = 1334 nex = 5
Term
Intr 
Intr 
Intr 
Init 



88998 88678 

89288 89093 

89677 89635 

89871 89764 

90011 89950 



>4512656 
len = 



/2227 



Term 
Intr 
Init 



98186 98145 
98404 98327 
99420 98801 



>4512690 
len = 



1930 nex = 



Init 
Intr 
Intr 
Term 



13553 13670 

13990 14118 

14808 14995 

15104 15473 



/20429 



len =



3383 



nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



22804 
23013 
23189 
23389 
23599 
24010 
24292 
24490 
24696 
24878 
25461 



22366 
22893 
23121 
23266 
23484 
23879 
24222 
24406 
24611 
24814 
25365 



len = 



2530 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



4054 
4266 
4517 
4741 
4977 
5158 
5345 
5505 
5714 
5907 
6307 



3778 
4144 
4353 
4607 
4825 
5084 
5268 
5441 
5603 
5797 
6091 



>4512690 
len = 



/30611 
1786 nex =



Init



72103 72432 



0
Intr
Intr 
Intr 
Term 



72543 72616 

73261 73335 

73495 73521 

73686 73888 



727629 
970 nex =



Term 
Intr 
Init 



74223 74064 
74366 74301 
75026 74717 



>4512690 
len = 



2086 nex = 



Term 
Intr 
Intr 
Intr 
Init 



84416 
84712 
84919 
85308 
86222 



84137 
84524 
84815 
85275 
85706 



>4512690 
len = 



2026 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



7610 
8243 
8397 
8529 
8713 
8966 
9212 



7394 
7704 
8338 
8479 
8646 
8823 
9088 



>4519183 
len = 



Init 
Term 



/205761 
1035 nex =



26007 26147 
26837 27041 



>4519183 
len = 



73325 
1098 nex =



Init 
Intr 
Term 



31470 31599 
31784 31867 
32242 32567 



>4519183 
len = 



725308 
1078 nex = 



Init 
Intr 



31473 31599 
31784 31867 
32242 32550 



>4519183 

7111672
1314 nex =

Init 42016 42331 + 0
Intr 42423 42478 + 0
Intr 42633 42692 + 0
Intr 42768 42845 + 0
Intr 42937 43066 + 0
Intr 43141 43193 + 0
Term 43279 43329 + 0




>4519183 
len = 
>4519183
len = 



Init 
Intr 
Term 



>4519183 

len =

Init 
Intr 
Intr 

Intr
Intr 
Term 



>4519183 
len = 
Sngl 

>4519186
len = 
Sngl 
>4519187 
len = 






Init 
Intr 
Intr 
Intr 
Term 



>4519187 
len = 
Term



/3201 

1104 nex = 

/36752 

977 nex = 

55410 55539 
55654 55936 
56039 56386 

/21404 

2255 nex = 

65304 65895 

65972 66016 

66102 66215 

66686 66766 

67046 67213 

67358 67558 

72875 

251 nex =

73445 73195 

734522 

574 nex = 

11854 12427 

720618 

1873 nex = 

26310 26574 

26838 27409 

27503 27565 

27662 27713 

27800 28182 

714357 

1270 nex = 

31639 31240 



Reference No. 2750-942P 



Intr 
Init 



32336 32265 
32505 32367 



>4519187 


len = 
Sngl 

>4519187
len = 



/104871 
350 nex = 
32958 32609 
72924 
1734 nex = 






Term 31639 31291 
Intr 32336 32265 
Init 33024 32367 



>4519187 729696 
len = 1707 nex =



Init 
Intr 
Intr 

Intr
Intr 
Intr 
Term 

>4519187
len = 



48852 48942 

49200 49468 

49539 49643 

49730 49816 

49902 49960 

50084 50237 

50302 50558 

725264 

2718 nex = 






Init 
Intr 
Intr 
Intr 
Intr 
Intr 

Intr
Term 

>4519188 

len =

Term 
Init 

>4519188
len = 



53157 53318 

53816 53957 

54090 54186 

54263 54332 

54428 54532 

54613 54711 

55041 55128 

55241 55522 

717098 



14912 14578 
15424 14997 



1436 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Term 



29371 29651 

29739 29792 

29890 29970 

30055 30124 

30331 30419 

30560 30806
>4519188
len = 



1462 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4519190 

len = 



29379 29651 

29739 29792 

29890 29970 

30055 30124 

30331 30419 

30560 30840 

/158397 



Init 
Term 



31318 31483 
31775 32418 



>4519190 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4519191 

len = 



3270 nex = 

37427 38279 

38619 38903 

38977 39195 

39285 39464 

39540 40031 

40123 40284 

40365 40696 

726549 



Init 
Term 



35516 35748 
35845 36249 



>4519191 
len = 



Term 
Init 



>4519191 
len = 



Term 
Init 



/35233 
1554 nex =



37318 36349 
37902 37399 



75748 
730 nex =



38416 37997 
38725 38497 



>4519191 
len = 



797675 
1914 nex =



Init 
Intr 
Intr 
Term 



45399 45522 

45674 45748 

46594 46904 

47000 47312
Init 47579 48385
Intr 48874 49077 
Term 49162 49696 



3780 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



59272 
59565 
59866 
60122 
60403 
60679 
61199 
61732 
61874 
62161 
62359 
62521 
62794 



59472 
59603 
60033 
60227 
60569 
60921 
61287 
61782 
61964 
62265 
62429 
62616 
63051 





len = 407 nex = 1
Sngl 23640 24046

>4519192 /37019
len = 2212 nex = 8
Init 34427 34741
Intr 34950 35008 +
Intr 35106 35180 +
Intr 35255 35368 +
Intr 35462 35532 +
Intr 35652 35949 +
Intr 36042 36331 +
Term 36419 36638 +

>4519192 /5819
len = 771 nex = 1
Sngl 3537 2767

>4519192 /29301
len = 2235 nex = 11
Init 39652 39840
Intr 40077 40138 +
Intr 40278 40380 +
Intr 40499 40557 +
Intr 40641 40664 +
Intr
Intr
Intr
Intr
Intr
Term

>4519192

len =

Term
Intr
Intr
Intr
Intr
Init

>4519192

len =

Term
Intr
Intr
Intr
Init

>4519192

Init
Intr
Intr
Intr
Intr
Intr
Term

>4519192

len =

Init
Intr
Intr
Intr
Intr
Term
>4519193
len =

Sngl 
>4519193 



40784 40866 

40968 41032 

41147 41196 

41287 41388 

41473 41532 

41634 41886 

/21955 

2209 nex = 

46248 45906 

46484 46326 

47033 46716 

47390 47130 

47555 47478 

48114 47633 

7533 

1489 nex = 

48529 48236 

48841 48619 

49138 48966 

49404 49245 

49705 49484 

/31447 

1930 nex = 



76620 
77019 
77315 
77567 
77855 
78180 
78423 



76794 
77233 
77472 
77754 
78107 
78328 
78546 



/33530 

1896 nex = 

861 1342 

1549 1603 

1757 1818 

1914 2020 

2119 2206 

2324 2756 

722588 

1210 nex = 

10581 11344 

712734 



len = 3093 nex = 15
Term
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4519193 

len = 

Sngl 

>4519193 

len = 

Sngl 

>4519193 

Sngl 

>4519193 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4519193 
len = 
Sngl 

>4519194 
len = 



11783 
11954 
12113 
12282 
12459 
12649 
12798 
12989 
13146 
13398 
13650 
13862 
14095 
14272 
14591 



11499 
11871 
12043 
12213 
12367 
12556 
12737 
12945 
13084 
13333 
13534 
13742 
13965 
14240 
14446 



72934 
610 nex = 
29932 29326 
/101342 
310 nex = 
43149 43454 
/31309 
763 nex = 
51604 50842 
738326 
2324 nex = 



75991 
76453 
76876 
77135 
77307 
77981 



75658 
76077 
76829 
76973 
77229 
77783 



7114703 
576 nex = 
7839 8414 

720539 
1577 nex = 






Init 
Intr 
Term 



15272 15482 
15570 15615 
16443 16848
len = 1407 nex =

Init 16952 17134 

Intr 17217 17260 

Intr 17389 17484 

Intr 17694 17786 

Intr 17919 18006 

Intr 18095 18156 

Term 18245 18358 

>4519194 732257 

len = 910 nex = 

Sngl 41773 40872 

>4519194 /9170 

len = 1599 nex = 

Term 48741 48223 

Intr 48949 48918 

Intr 49210 49075 

Init 49821 49647 

>4519194 /36525 

len = 1953 nex = 

Sngl 59919 61346 

>4519194 77233 

len = 1270 nex = 

Term 68594 68376

Intr 68973 68686 

Intr 69272 69107 

Init 69644 69537 

>4519194 7819 

len = 1270 nex = 

Init 7404 7643 

Intr 7741 8080 

Intr 8177 8230 

Term 8376 8673 

>4519194 76181 

len = 1810 nex = 

Init 74654 74757 

Intr 75098 75146 

Intr 75472 75579 

Intr 75770 75884 

Intr 75966 76083
Term

>4519194

len =

Init
Term

>4519195



76205 76459 

/13954 

1019 nex = 

9367 9653 
9736 10385 



3889 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



12207 
12457 
12656 
12841 
13213 
13440 
13706 
13944 
14169 
14435 
14556 
14691 
14893 
15044 
15431 
15719 



11831 
12314 
12530 
12762 
12962 
13312 
13657 
13915 
14056 
14374 
14514 
14636 
14836 
14976 
15352 
15536 



2377 



nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



24714 
24903 
25545 
26012 
26589 
26817 



24441 
24813 
25430 
25683 
26090 
26772 



>4519195 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



1753 nex =



42376 
42664 
42979 
43253 
43622 
43875 



42569 
42895 
43142 
43535 
43788 
44128 



794524 
475 nex =



Init 
Term 



43654 43788 
43875 44128 



>4519195
/20767
1874 nex =

Term 47286 46916 0
Intr 47435 47386 0
Intr 47604 47531 0
Intr 48033 47989 0
Intr 48249 48134 0
Init 48789 48477 0



>4519195 

len = 

Term 
Init 

>4519197 

len = 

Sngl 

>4521999 

len = 

Sngl 

>4521999 

len = 

Sngl 

>4521999 

len = 

Sngl 

>4521999 

len = 

Sngl 

>4521999 

len = 

Sngl 

>4521999 

len = 

Sngl 

>4521999 



/97350 

613 nex = 

67803 67627 
68239 67907 

732643 

1763 nex = 

27667 25905 

/98976 

430 nex = 

10245 10673 

/109109 

399 nex = 

10295 10686 

/106320 

278 nex = 

10379 10656 

/115957 

295 nex = 

10445 10739 

/114949 

285 nex = 

10447 10731 

/10836 

562 nex = 

12451 11890 

/151572
len = 1914 nex =

Term 22290 22087 

Intr 22510 22406 

Intr 22867 22604 

Intr 23125 22964 

Intr 23303 23226 

Init 24000 23670 

>4521999 /31655 

len = 1630 nex = 

Term 26057 25794 

Intr 26219 26143 

Intr 26450 26332 

Intr 26899 26820 

Init 27415 27008 

>4521999 /92908 

len = 2017 nex = 

Term 31465 31171 

Intr 31816 31636 

Intr 32453 32089 

Init 33187 33057 

>4521999 /34062 

len = 1139 nex = 

Term 38591 38301 

Intr 38843 38709 

Init 39439 39192 

>4521999 73948 

len = 1096 nex = 

Sngl 9611 9785 

>4521999 795453 

len = 1044 nex = 

Init 9613 9785 

Intr 10095 10272 

Term 10295 10656 

>4521999 717347 

len = 1074 nex = 

Sngl 9613 10686 

>4521999 7125485 

len = 1115 nex =
Init
Term 



9613 9785 
10095 10272 



>4521999 

len = 

Init 
Intr 
Term 



/125977 

1120 nex = 

9613 9785 
10095 10272 
10295 10732 



>4521999 
len = 



798855 
1126 nex =



Init 
Intr 
Term 

>4521999 

len = 

Init 
Term 



9613 9785 

10095 10272 

10295 10738 

/7420 

986 nex =

9615 9785 

10095 10600 



>4521999 
len = 



7234 
1118 nex =



Init 
Term 



9615 9785 
10095 10272 



>4521999 
len = 



726796 
1119 nex =



Init 
Intr 
Term 



9615 9785 
10095 10272 
10295 10733 



>4521999 
len = 



Init 
Term 



>4522002 
len = 



/13879 
1121 nex =



9615 9785 
10095 10735 



/18876 
2097 nex =



Init 
Intr 
Intr 
Term 



31368 31702 

31777 31877 

32003 32211 

32776 33464 



>4522002 

/8550
len =



Sngl 
>4522002 
len = 



36522 35979 
/10080 



2566 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



37918 37569 

38093 38029 

38349 38274 

38569 38442 

38766 38659 

39324 39168 

40134 39924



>4522002
len = 



/35371 
1570 nex =



Term 
Intr 
Init 



70494 70283 
70779 70573 
71849 71563 



>4531433 
len = 



/142861 
940 nex =



Sngl 
>4531433 
len =



33855 32916 
/16421 



1991 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4531433 
len = 
Sngl 

>4538895 
len = 



35017 
35195 
35388 
35569 
35771 
35976 
36161 
36341 



34361 
35098 
35286 
35484 
35684 
35863 
36073 
36247 



/41669 
276 nex = 
37215 36940 
77395 
2470 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



47375 
47611 
47817 
48199 
48407 
49561 



47093 
47463 
47700 
48007 
48286 
49113
>4538895
len = 
Sngl 

>4538895 
len = 



78795 
756 nex = 
49466 48947 
/24071 
1392 nex = 



Init 
Intr 
Term 



72494 72654 
72892 73080 
73174 73422 



>4538895 
len = 



/27800 
790 nex =



Init 
Intr 
Term 



72493 72654 
72892 73080 
73174 73280 



>4538895 
len = 



/19582 
1899 nex =



Term 
Init 



6321 5522 
7420 6901 



>4538895 
len = 



725257 
1939 nex = 



Term 
Init 



6321 5516 
7454 6901 



>4538895 



7142381 
1954 nex =



Init 
Intr 
Term 



87087 87340 
88405 88485 
88572 89040 



>4538918 
len = 



733495 
2564 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



35043 35417 

35652 35904 

36064 36212 

36295 36552 

36711 36902 

37042 37606 



>4538918 
len = 



715410 
1705 nex =



Term 



52803 52534 



0
Intr
Intr 
Init 

>4538918 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



53006 52889 
53214 53100 
53440 53336 



2096 

55471 
56096 
56317 
56641 
56932 
57239 
57378 



nex = 

55645 
56230 
56541 
56823 
57155 
57284 
57566 



>4538918 
len = 



3130 



nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4538918 
len = 
Sngl 

>4538918 
len = 



70373 
70658 
70897 
71153 
71398 
71582 
71733 
72056 
72312 
72493 
72788 
73209 



70535 
70701 
70992 
71245 
71488 
71643 
71917 
72233 
72355 
72682 
72805 
73502 



/7421 
1241 nex = 
79061 77821 
/148597 
1482 nex = 



Init 
Intr 
Intr 
Term 

>4538949 

len = 



93052 93292 

93665 93847 

94012 94176 

94312 94533 

/13231 

738 nex = 



Init 
Intr 
Term 



52100 52260
52368 52518 
52605 52837 



len = 1486 nex = 6
Init 53668 53951 +
Intr 54035 54119 +
Intr 54209 54259 +
Intr 54358 54478 +
Intr 54585 54719 +
Term 54798 55153 +



>4538949 
len = 
Sngl 

>4538949 
len = 



1825 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4538949 

len = 



66408 
66590 
66808 
67001 
67205 
67362 
67521 



66496 
66728 
66882 
67122 
67280 
67421 
67843 



736655 
3584 nex =



Term 
Intr 
Intr 
Init 



77287 76801 

77426 77365 

78110 77691 

80384 79836 



>4538972 
len = 



2801 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



11535 
11698 
11841 
12019 
12214 
12391 
12711 
12872 
13313 
13530 
13703 
13870 
14145 



11345 
11621 
11769 
11939 
12164 
12297 
12631 
12788 
13234 
13416 
13633 
13786 
13945 



Init 
Intr 
Intr 
Intr 



/41992 

2052 nex = 

32316 32561 

32643 32715 

33489 33697 

33789 33935
Term
>4538972 
len = 
Sngl 
>4538972 
len = 



34034 34367 
735962 
823 nex = 

35958 36389 
/31115 



1478 



nex =



Init 
Intr 
Term 

>4538990 

len = 



3879 4068 
4258 4391 
5063 5356 

/32791 

4390 nex = 



Term 22701 22300
Intr 22820 22779
Intr 22937 22896
Intr 23128 23029
Intr 23284 23223
Intr 23456 23375
Intr 25216 25034
Init 26686 26545





>4538990 
len = 
Sngl 

>4538990 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4538990 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



/36491 
533 nex = 
4200 4732 

/40182 
2574 nex = 



47130 
47313 
47449 
47627 
47778 
47991 
48467 
48934 



2984 

56364 
56633 
56834 
57028 
57314 
57443 
57592 
57780 



47007 
47244 
47400 
47560 
47717 
47868 
48429 
48614 



nex = 

56049 
56462 
56722 
56963 
57213 
57391 
57535 
57682
Intr
Intr 
Intr 
Init 

>4539290 

len = 

Init 
Intr 
Term 

>4539290 

len = 

Init 
Term 

>4539290 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4539290 

len = 

Term 
Init 

>4539290 
len = 
Sngl 

>4539290 

len = 

Term 
Init 

>4539290 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 



57898 57855 

58299 58209 

58481 58455 

59032 58791 

/9546 

1353 nex = 

31019 31261 
31721 31841 
31919 32371 

/111551 

680 nex =

31692 31841 
31919 32371 

/98650 

1931 nex = 

36398 36658 

36936 37160 

37542 37689 

37807 37886 

37970 38328 

76966 

955 nex = 

57171 56874 
57828 57579 

/16673 

976 nex = 

57850 56875 

/31982 

1005 nex = 

57171 56874 
57878 57579 

/37817 

2214 nex = 

58762 59031 

59151 59225 

59572 59628 

59712 59866 

59985 60119 

60211 60313
Intr 60402 60501 + 0
Term 60611 60975

>4539290 /103513
len = 198 nex =
Sngl 60820 61017 + 0

>4539290 /10511
len = 1487 nex = 5
Term 4951 4724
Intr 5226 5106 - 0
Intr 5588 5309 0
Intr 6021 5973 0
Init 6210 6132 - 0

>4539290 /24045
len = 713 nex = 2
Init 86190 86240 0
86397 86784 + 0

>4539309 /31833
len = 1938 nex = 3
Init 23367 23550 + 0
Intr 24243 24528 + 0
Term 24866 25304 + 0

>4539309 /40069
len = 1415 nex = 1
Sngl 28842 30256 + 0

>4539309 /119970
len = 1487 nex = 6
Term 40137 39707 - 0
Intr 40406 40272
Intr 40622 40484 0
Intr 40793 40703 - 0
Intr 41003 40894 - 0
Init 41193 41096 - 0

>4539309 /6351
len = 2193 nex = 7
Term 40137 39693 0
Intr 40406 40272 0
Intr 40622 40484 0
Intr 40793 40703 0
Intr 41003 40894 0
Intr
Init 

>4539309 

len =

Term 
Intr 
Init 

>4539309 

len = 

Term 
Init 

>4539309 

len = 

Term 
Init 

>4539309 
len = 
Sngl 

>4539309 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4539331 

len = 

Term 
Intr 
Init 

>4539353 

len = 

Sngl 

>4539353 

len = 

Sngl 



41365 41096 
41885 41628 

742753 

911 nex = 

65462 65294 
65761 65543 
66204 66078 

/18408 

1111 nex = 

65761 65543 
66377 66078 

/28021 

1427 nex = 



65761 
66693 



65543 
66078 



/6045 
262 nex = 
92055 91794 
78278 



2297 

94508 
94898 
95549 
96124 
96427 



nex = 

94131 
94766 
95481 
95779 
96207 



738915 

790 nex = 

47037 46932 
47423 47129 
47720 47526 

77769 

556 nex = 

2681 2129 

792155 

411 nex = 

2685 2275
>4539353
len = 
Sngl 
>4539378 
len = 
Sngl 
>4539378 
len = 



/1732 
503 nex = 
53207 53709 
/2917 



370 



nex =



34714 34349 



1179 nex =



Init 
Intr 
Intr 
Intr 
Term 

>4539378 

len = 



61039 61098 

61178 61285 

61443 61485 

61581 61776 

61866 62217 

/38408 

1035 nex = 



Term 
Init 



72291 71647 
72681 72375 



>4539378 
len = 



Term 
Init 



79437 78787 
80401 79977 



>4539402 
len = 



2875 nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



27509 
27711 
28151 
28346 
28516 
29461 
29655 
29831 
29988 
30258 



27384 
27643 
28065 
28239 
28426 
29382 
29551 
29737 
29910 
30076 



>4539402 
len = 



/41359 
3130 nex =



Term 
Intr 
Intr 
Intr 



27509 27214 

27711 27643 

28151 28065 

28346 28239
Intr
Intr 
Intr 
Intr 
Intr 
Init 

>4539402 

len = 

Term 
Init 

>4539402 

len = 

Term 
Intr 
Init 

>4539415 

len = 

Init 
Intr 
Term 

>4539415 
len = 
Sngl 

>4539415 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4539448 
len = 
Sngl 

>4539448 
len = 



28516 28426 

29461 29382 

29655 29551 

29831 29737 

29988 29910 

30343 30076 

/41155 

918 nex = 



35555 
36257 



35468 
36135 



/21228 

1469 nex =

35555 35325 
36257 36135 
36793 36352 

/742 

1403 nex = 

26047 26138 
26513 26661 
27042 27449 

/14013 

828 nex = 

30384 30766 

/14312 

2186 nex = 



31374 
31712 
32178 
32659 
32806 
33072 



31324 
31634 
32124 
32576 
32745 
32898 



738976 
401 nex = 
19914 19514 
/13796 
1156 nex = 



Term 19237 18760 
Init 19915 19490 

>4539448

Term
Intr
Intr
Intr 
Intr 
Init 

>4539448 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4544365 
len = 
Sngl 

>4544365 

len = 

Term 
Intr 
Init 

>4544365 

len =

Term 
Init 

>4544365 

len = 

Init 
Term 

>4544365 
len = 
Sngl 

>4544381 
len = 



78846 

1419 nex = 

3827 3621 

4092 3925 

4400 4282 

4556 4495 

4723 4647 

5039 4865 

/26831 

1511 nex = 

3827 3616 

4092 3925 

4400 4282 

4556 4495 

4723 4647 

5126 4865 

/32729 

1218 nex = 

13401 14618 

/3839 

2110 nex = 

23021 22565 
23867 23630 
24665 24122 

/32380 

1118 nex = 

44571 44182 
45299 44761 

/28318 

970 nex =

47075 47349 
47442 



79657 
533 nex =
80944 80412 
730607 
1011 nex =
Init
Intr 
Term 

>4544381 

len = 

Sngl 

>4544381 

len = 

Sngl 

>4544381 

len = 

Sngl 

>4544405 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4544405 
len = 
Sngl 

>4544405 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 



43399 43605 
43651 43747 
43844 44402 

/38904 

207 nex = 

44129 44335 

/6922 

827 nex = 

93744 92918 

/43076 

1855 nex = 

94771 92917 

73366 

1270 nex =

108883 109010 

109298 109331 

109412 109584 

109671 109722 

109823 109890 

109982 110152 

/21909 

1781 nex = 



25216 
25776 
26206 
26395 
26623 
26747 



25418 
25847 
26284 
26527 
26661 
26996 



/11114 

1439 nex = 

51750 50803 

/40781 

1633 nex = 

80918 80716 

81127 81067 

81666 81595 

81789 81740 

82070 81958 

82348 82159
>4544405
len =

Sngl
>4544435
len =

Sngl
>4544435

len =
Sngl

>4544435
len =

Init
Intr
Term

>4544435
len =

Term
Intr
Intr
Intr
Init

>4544435
len =

Sngl
>4544435
len =

Init
Intr

Intr
Intr
Term

>4544435 



/104918 

1161 nex = 

90572 90694 

/20723 

1413 nex = 

29535 30947 

/111553 

510 nex = 

30533 31042 

/117955 

1450 nex = 

3407 3589 
4304 4489 
4602 4848 

732573 

3571 nex = 

49696 49307 

50351 50080 

50691 50599 

52326 52225 

52877 52427 

738277 

535 nex = 

78881 78813 

726961 

1350 nex = 

92102 92249 

92397 92496 

92587 92644 

92722 93078 

93165 93451 

710312 

2925 nex = 



Term 94666 93767 
Intr 94855 94772 
Intr 94981 94933
Intr
Intr
Init

>4557056

len =

Term
Intr
Intr
Init

>4557061

len =
Sngl

>4557061
len =

Term
Intr
Init

>4557061

len =

Term
Intr
Init

>4558521
len =

Sngl

>4558521
len =

Term
Init

>4558521

len =

Term
Intr
Intr
Init

>4558521 



95346 95304 
95580 95523 
96143 96015 

738982 

1316 nex = 

13849 13681 

14205 14054 

14347 14277 

14637 14454 

77520 

1480 nex = 

23980 24135 

739407 

1630 nex =

26056 25532 
26195 26155 
27156 26894 

75435 

1632 nex = 

26056 25532 
26195 26155 
27012 26894 

723664 

898 nex = 

30777 31674 

737480 

760 nex = 

41994 41798 
42557 42246 

727813 

1810 nex = 

60751 60244 

61451 61243 

61676 61576 

62053 61830 

77488 



len = 1414 nex = 2
Term 83600 83130 -
Init 84090 83849 -

>4558586 /18981
len = 3812 nex = 12
Term 7302 6978 -
Intr 7549 7499 -
Intr 7962 7870 -
Intr 8232 8157 -
Intr 8456 8340
Intr 8680 8616 -
Intr 9112 8999
Intr 9310 9212
Intr 9563 9472
Intr 10014 9891 -
Intr 10285 10118 -
Init 10789 10619 -

>4558586 /18724
len = 4210 nex = 14
Term 11491 11046
Intr 11945 11761
Intr 12171 12058
Intr 12446 12359
Intr 12612 12540
Intr 13246 13121
Intr 13516 13427
Intr 13708 13598 -
Intr 14034 13801
Intr 14205 14128 -
Intr 14379 14308
Intr 14551 14468 -
Intr 14778 14659
Init 15254 15074

>4558586 /2070
len = 1824 nex = 1
Term 33890 33632 -
Intr 34060 33981
Intr 34417 34262 -
Intr 34645 34510
Intr 34881 34730
Intr 35164 35075 -
Init 35455 35254 -

>4558586 /157873
len = 649 nex = 3
Term 34881 34807
Intr 35164 35075
Init 35455 35254

>4558586

len =

Term
Intr
Intr
Intr
Intr
Intr
Init

>4558586
len =

Sngl
>4558586

len =
Sngl

>4558590
len =
Init

Intr
Term

>4558590

len =

Init
Term

>4558656
len =
Sngl

>4558656
len =
Sngl
>4558656
len =



Term 
Init 



/35814 

1938 nex = 

33890 33575 

34060 33981 

34417 34262 

34645 34510 

34881 34730 

35164 35075 

35512 35254 

/19277 

655 nex = 

77232 77886 

yi06135 

635 nex = 

84401 85035 

/41858 

2113 nex = 

3306 3445 
3564 4886 
5194 5418 

/14128 

1425 nex = 

3948 4886 
5194 5372 

/95747 

859 nex = 

82575 83433 

/25241 

657 nex = 

82776 83432 

/17483 

2449 nex = 

98045 96936 
99384 98603 



>4558674 /110217
len =
Sngl
>4558674 
len = 
Sngl 
>4558674 
len = 
Sngl 
>4559319 
len = 



430 nex = 
49557 49135 
/29241 
730 nex = 
50473 51194 
733853 
550 nex = 
50532 51077 
/19749 
2077 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



51859 52123 

52351 52657 

52735 52822 

52920 53029 

53123 53271 

53559 53620 



>4559319 
len = 



/17752 
2650 nex 



Init 
Intr 
Intr 
Intr 
Term 



65746 
66784 
67501 
67761 
68034 



66206 
65955 
67676 
67919 
68387 



>4559319 
len = 



/19221 
850 nex 



Init 
Term 



75881 76149 
76407 76724 



>4559319 
len = 
Sngl 

>4559319 
len = 



/104159 
523 nex = 
77455 76942 
/41181 
3 37 0 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



77618 
77969 
78281 
79196 
79602 
80468 



77108 
77856 
78114 
78954 
79519 
79689 
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3250 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4559344 

len = 

Term 
Init 

>4559344 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4559344 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4559344 

len = 

Sngl 

>4559344 

len = 

Sngl 

>4559344 



84481 
85229 
85302 
85490 
85652 
85920 
86232 
86584 
86992 
87283 
87461 
87662 



84692 
85264 
85358 
85573 
85723 
85991 
86272 
86650 
87183 
87372 
87568 
87727 



/17434 

1227 nex = 

99351 98868 
100094 99455 

/30071 

1657 nex = 

36166 35586 

36370 36254 

36762 36454 

37029 36967 

37242 37124 

/10798 

1189 nex = 

45685 45867 

45942 46037 

46237 46359 

46569 46608 

46704 46873 

/1916 

490 nex = 

57577 57090 

76722 

409 nex = 

57584 57176 

/10261 



len = 522 nex = 1
Sngl

>4559344 

len = 

Init 
Intr 
Term 

>4559344 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4559344 

len = 

Sngl 

>4559344 

len = 

Sngl 

>4559344 

len = 

Term 
Intr 
Init 

>4559375 

len = 

Sngl 

>4559375 

len = 

Sngl 

>4559375 

len = 

Sngl 



57598 57077 

/97914 

1291 nex = 

78733 78958 
79545 79694 
79797 80023 

/14570 

22 80 nex = 

81776 82132 

82307 82649 

82774 82992 

83092 83151 

83231 83386 

83500 83593 

83750 84055 

/122569 

214 nex = 

83842 84055 

/36270 

13 7 7 nex = 

91284 90056 

729774 

1937 nex = 

91284 90056 
91468 91366 
91992 91560 

/102981 

568 nex = 

19716 20275 

/94104 

516 nex = 

21573 22085 

/12474 

737 nex = 

26746 26629 
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1760 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



38581 
38834 
39074 
39265 
39547 
39807 
40251 



38611 
39022 
39179 
39450 
39712 
40123 
40340 



Term 
Intr 
Intr 
Init 



1426 

16701 
17094 
17731 
17857 



16432 
16777 
17187 
17801 



17 6 0 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4567193 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



28035 
28415 
28602 
28796 
29087 
29337 



27578 
28113 
28501 
28685 
29011 
29173 



/206273 

2060 nex = 

30185 29858 

30460 30268 

30657 30541 

31111 30896 

31336 31225 

31570 31506 

31737 31656 



>4567193 
len = 
Sngl 

>4567193 
len = 



/15405 
192 nex = 
39059 39250 
/30185 
1632 nex = 



Term 
Intr 
Intr 
Init 



41869 41427 

42547 42403 

42721 42661 

43058 42799 
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>4567193 
len = 



Init 
Term 



65231 65374 
65556 66158 



>4567193 
len = 



/153017 
616 nex 



Init 
Term 



65262 65374 
65556 65877 



>4567237 
len = 



1334 



Init 
Intr 
Intr 
Intr 
Term 



37681 
38008 
38602 
38750 
38969 



37817 
38054 
38657 
38866 
39006 



>4567237 
len = 



2314 



nex ■ 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



52101 
52608 
52766 
52909 
53119 
53328 
53682 
54076 



52260 
52692 
52834 
53021 
53230 
53471 
53856 
54414 



>4567237 
len = 



/22703 
1450 nex = 



Term 
Intr 
Init 



56329 56148 
56617 56415 
57116 56959 



>4567237 
len = 



/33370 
68 5 nex 



Term 
Intr 
Init 

>4567237 



76509 76290 
76701 76598 
76968 76910 

/11284 

693 nex = 



Term 76509 76285 
Intr 76701 76598 
Init 76968 76910
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1441 

>4567259 /33
len = 1944 nex = 6
Term 678 284 - 0
Intr 980 761 - 0
Intr 1195 1063 - 0
Intr 1445 1328 - 0
Intr 1621 1527 - 0
Init 2227 2024 - 0

>4567259 /38112
len = 1572 nex = 5
Term 26086 25917 - 0
Intr 26263 26173 - 0
Intr 26465 26356 - 0
Intr 26929 26660 - 0
Init 27488 27013 - 0

>4567259 /20834
len = 3815 nex = 14
Init 29608 30133 + 0
Intr 30762 30856 + 0
Intr 30987 31095 + 0
Intr 31188 31234 + 0
Intr 31412 31467 + 0
Intr 31629 31767 + 0
Intr 31884 32021 + 0
Intr 32100 32168 + 0
Intr 32254 32364 + 0
Intr 32467 32595 + 0
Intr 32688 32765 + 0
Intr 32854 32925 + 0
Intr 33026 33088 + 0
Term 33177 33422 + 0

>4567259 /121118
len = 2034 nex = 9
Init 31629 31767 + 0
Intr 31884 32021 + 0
Intr 32100 32168 + 0
Intr 32254 32364 + 0
Intr 32467 32595 + 0
Intr 32688 32765 + 0
Intr 32854 32925 + 0
Intr 33026 33088 + 0
Term 33177 33474 + 0

>4567259 /21730
len = 1774 nex = 6
Term 2963 2647 - 0
Intr 3263 3044
Intr 3478 3346
Intr 3746 3629
Intr 3925 3831
Init 4420 4216



>4567259 /34470
Init 5119 5168 + 0
Intr 5278 5355 + 0
Intr 5522 5546 + 0
Intr 5632 5664 + 0
Intr 5971 6101 + 0
Intr 6291 6511 + 0
Intr 6606 6853 + 0
Term 6954 7195 + 0



len = 2209 nex =
Init 46682 46836
Intr 47131 47236
Intr 47341 47382
Intr 47532 47597
Intr 47686 47802
Intr 47939 48214
Intr 48303 48350
Intr 48427 48548
Term 48699 48890



>4567259 736598
len = 2212 nex =
Init 46682 46836
Intr 46924 47015
Intr 47131 47236
Intr 47341 47382
Intr 47532 47597
Intr 47686 47802
Intr 47939 48013
Intr 48112 48214
Intr 48303 48350
Intr 48427 48548
Term 48699 48893



>4567259 /109948
len = 475 nex =
Init 48460 48548
Term 48699 48934



len = 1185 nex = 1
Sngl

>4567259
len =
Term
Intr
Init

>4567259
len =
Sngl

>4567259
len =
Sngl

>4567300
len =
Init
Intr
Term

>4567300
len =
Init
Intr
Term

>4567300
len =
Term
Intr
Intr
Init

>4567300
len =
Sngl

>4572664
len =



52746 53930 

/21453 

1410 nex = 

69518 68949 
69807 69688 
70358 69891 

733656 

4 63 nex = 

9884 10346 

/156111 

322 nex = 

9910 10231 

/125386 

910 nex = 

13167 13335 
13426 13542 
13632 14051 

/101253 

859 nex = 

33481 33676 
33813 33938 
34031 34339 

/123234 

13 30 nex = 

34776 34516 

35061 34867 

35283 35188 

35576 35463 

/24221 

393 nex = 

8449 8841 

/2867 

1510 nex = 



Term 55509 55260 
Intr 55693 55605 
Intr 55882 55783
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Init 

>4572664 

len = 

Sngl 

>4572664 

len = 

Sngl 

>4572664 

len = 

Init 
Intr 
Term 

>4572664 

len = 

Term 
Init 

>4572664 

len = 

Init 
Term 

>4572664 
len = 
Sngl 

>4580365 

len = 

Term 
Intr 
Intr 
Init 

>4580365 
len = 
Sngl 

>4580365 
len = 



56077 56005 

/34807 

1418 nex = 

59430 60847 

/113985 

17 3 nex = 

66412 66584 

/11953 

1510 nex = 

66481 67144 
67636 67764 
67855 67987 

/34681 

1379 nex = 

73338 72795 
74173 73545 

/37401 

1316 nex = 

76483 76979 
77234 77798 

/36719 
1231 nex = 
79365 78135 

727596 

136 5 nex = 

99273 99000 

99825 99644 

100012 99954 

100364 100196 

/27081 

492 nex = 

43711 43220 

/39571 

2540 nex = 
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Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



60690 
61125 
61348 
61645 
61925 
62147 
62438 
62686 
62943 



60861 
61240 
61407 
61830 
62029 
62243 
62545 
62821 
63229 



>4580365 
len = 

15 

Init 
Intr 
Intr 
Term 

20 >4580365 



/108216 

1090 nex = 

62147 62243 
62438 62545 
62686 62821 
62943 63229 
/158765 



len = 

Init 
Intr 
Intr 
Term 



2230 nex = 

96491 97832 

97943 98008 

98083 98148 

98215 98259 



/3900 



Init 
Intr 

3 5 Intr 
Intr 
Term 



1249 nex = 

97546 97832 

97943 98008 

98083 98148 

98215 98259 

98386 98794 



>4580454 /36815
len = 2248 nex = 6
Term 23978 23775 -
Intr 24212 24072 -
Intr 24520 24304 -
Intr 24794 24625 -
Intr 25442 25197 -
Init 26022 25523 -

>4580454 /12246
len = 1961 nex = 7
Init 26312 26626 +
Intr 26975 27027 +
Intr 27104 27213 +
Intr 27296 27463 +
Intr 27553 27650 +
Intr 27792 27918 +
Term 28033 28272 +

>4580454 /123469
len = 2112 nex =
Init 29262 29569 + 0
Intr 30006 30058 + 0
Intr 30153 30262 + 0
Intr 30351 30518 + 0
Intr 30608 30705 + 0
Intr 30832 30958 + 0
Term 31073 31373 + 0



>4580732 
len = 
Sngl 

>4580732 

len = 

Term 
Intr 
Init 

>4580744 

len = 



Term 
Intr 
Intr 
Init 

>4580744 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4580744 

len = 

Sngl 

>4580744 

len = 

Sngl 



/43073 

1642 nex = 

102136 103777 

/1069 

1112 nex = 

42616 42417 
43115 43025 
43528 43375 

/34680 

2025 nex = 

15210 14726 

15766 15298 

16116 15998 

16750 16441 

/1294 

2050 nex = 

22434 22008 

22792 22540 

23073 22873 

23265 23147 

24057 23791 

/6124 

251 nex = 

31321 31071 

/36505 

3231 nex = 

32321 31071 



>4580744 /97415
len = 854 nex = 1
Sngl 40155 39760 -

>4580744 /117519
len = 1292 nex = 1
Sngl 41250 42541 +

>4580744 /783
len = 1575 nex = 5
Term 44523 44000 -
Intr 44831 44655 -
Intr 45013 44928 -
Intr 45231 45112 -
Init 45559 45529 -

>4580744 /37712
len = 2875 nex = 12
Term 44523 44251 -
Intr 44831 44655 -
Intr 45013 44928 -
Intr 45231 45112 -
Intr 45559 45529 -
Intr 45722 45647 -
Intr 45995 45908 -
Intr 46153 46092 -
Intr 46314 46238 -
Intr 46521 46419 -
Intr 46677 46616 -
Init 47125 47027 -

>4580744 737965
len = 3581 nex = 14
Term 55823 55611 -
Intr 56063 55914 -
Intr 56251 56154 -
Intr 56471 56331 -
Intr 56734 56585 -
Intr 56994 56818 -
Intr 57262 57082 -
Intr 57604 57559 -
Intr 57984 57921 -
Intr 58092 58070 -
Intr 58311 58239 -
Intr 58627 58496 -
Intr 58799 58713 -
Init 59191 59053 -

>4580744 /40575
len = 3683 nex = 14
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Term 
Intr 
Intr 
Intr 
5 Intr 
Intr 
Intr 
Intr 
Intr 

10 Intr 
Intr 
Intr 
Intr 
Init 

15 

>4580744 

len = 

2 0 Init 
Intr 
Intr 
Intr 
Intr 

2 5 Term 
>4580744 
len = 

30 

Init 
Term 



Term 
Init 

40 

>4580744 

len = 

45 Init 
Intr 
Intr 
Intr 
Intr 

5 0 Term 
>4580745 
len = 

55 

Sngl 
>4580745 
60 len = 



55823 
56063 
56251 
56471 
56734 
56994 
57262 
57604 
57984 
58092 
58311 
58627 
58799 
59304 



55622 
55914 
56154 
56331 
56585 
56818 
57082 
57559 
57921 
58070 
58239 
58496 
58713 
59053 



728624 

2278 nex = 

60438 60505 

60760 60832 

61116 61231 

61311 61403 

61511 61610 

61781 62029 

794743 

463 nex = 

67354 67484 
67496 67816 

739230 

92 0 nex = 

74699 74303 
75222 74942 

718908 

2052 nex = 

75765 76184 

76412 76510 

76599 76760 

76852 76973 

77156 77275 

77360 77816 

72757 

777 nex = 

11301 12077 

732177 

757 nex = 
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Sngl 
>4580745 
len = 
Sngl 
>4580745 
len = 



11324 12080 
/159018 
694 nex = 
11389 12082 
/3297 
2 687 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



14071 
14202 
14379 
14542 
14703 
14942 
15083 
15381 
15714 
16245 



13559 
14142 
14272 
14483 
14625 
14872 
15027 
15201 
15550 
16004 



>4580745 
len = 



/12622 
2315 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



59279 
59843 
60132 
60370 
60831 
61005 



59398 
59965 
60197 
60435 
60911 
61589 



>4580745 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4581084 

len = 



/10117 

2193 nex = 

7149 6887 

7300 7240 

7562 7493 

7945 7813 

8854 8764 

9079 8961 

732373 

1583 nex = 



Term 
Intr 
Intr 
Init 



13458 13179 

13805 13635 

14191 14006 

14761 14500 



/3173 
2123 nex ■■ 



60 



Init 



14954 15412 



0 
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Intr 
Intr 
Intr 
Term 



15767 16020 

16168 16291 

16405 16509 

16791 17076 



>4581084 
len = 



/34830 
1601 nex 



Term 
Intr 
Init 



22349 21931 
23000 22548 
23531 23196 



Term 
Init 



len ■ 



40647 39196 
41020 40887 



/41311 
2171 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



41924 
42756 
42930 
43048 
43184 
43366 
43750 



42481 
42849 
42963 
43102 
43267 
43653 
44094 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



3055 
3229 
3411 
3664 
3843 
4154 
4317 
4792 



2967 
3167 
3322 
3604 
3749 
4065 
4231 
4411 



>4581084 
len = 



2451 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



3055 
3229 
3411 
3664 
3843 
4154 
4317 



2957 
3167 
3322 
3604 
3749 
4065 
4231 



len = 1289 nex = 3
Term 60581 60181 - 0
Intr 61120 61028 - 0
Init 61458 61210 - 0

>4581084 /38429
len = 2273 nex = 6
Term 60581 60221 - 0
Intr 61120 61028 -
Intr 61458 61210 -
Intr 61709 61557 - 0
Intr 62108 61784 - 0
Init 62493 62191 - 0

>4581084 75677
len = 943 nex = 2
Init 72815 73073 + 0
Term 73160 73757 +

>4581103 736296
len = 1853 nex = 4
Init 105971 106313 + 0
1UDD410 lUOooO
Intr 107212 107352 + 0
Term 107448 107823 + 0

>4581103 7119868
len = 638 nex = 2
Term 25527 25372 -
Init 26009 25647 - 0

>4581103 726940
len = 1076 nex = 3
Init 49009 49124 + 0
Intr 49202 49461 + 0
Term 49704 50084 +

>4581103 723995
len = 755 nex = 1
Sngl 5195 5949 + 0

>4581103 731917
len = 2086 nex = 3
Init 52571 52873
Intr 54100 54217
Term 54298 54656
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>4581103 
len = 



1056 nex 



Init 
Intr 
Intr 
Term 



72192 72336 

72438 72565 

72751 72855 

72981 73247 



>4581103 
len = 



/40637 



2148 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4581103 

len = 



86242 85852 

86439 86344 

86717 86593 

87013 86870 

87427 87104 

87999 87653 

/19417 

1281 nex = 



Term 
Intr 
Intr 
Init 

>4581103 

len = 



88901 88344 

89054 88975 

89191 89149 

89329 89270 

723735 

1579 nex = 



Init 
Term 



90125 90516 
90763 91703 



>4581138 
len = 



207 9 nex ■■ 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4581138 

len = 



17050 17171 

17298 17403 

17518 17646 

17927 18071 

18203 18760 

18845 19128 

75234 

812 nex = 



Init 
Term 



>4581138 
len = 



18317 18760 
18845 19128 



/13024 
16 80 nex 



60 



Term 



46797 45819 



0 
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Intr 
Init 

>4581138 

len = 

Sngl 

>4581138 

len = 

Sngl 

>4581138 

len = 



47110 46883 
47498 47408 

/18705 

550 nex = 

52038 51498 

739339 

291 nex = 

70179 69889 



Term 
Intr 
Init 



70209 69891 
70349 70293 
70536 70438 



>45eil38 
len = 



2972 



nex ■■ 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



70209 
70349 
70629 
70921 
71094 
71319 
71648 
71860 
72035 
72187 
72413 
72911 



69940 
70293 
70438 
70727 
71017 
71182 
71466 
71780 
71949 
72140 
72285 
72769 



/12298 
3528 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



88301 
88550 
88730 
88922 
89176 
89364 
89613 
89822 
89997 
90230 
90498 
90705 
90897 
91053 
91480 
91677 



88493 
88646 
88827 
89077 
89253 
89495 
89693 
89893 
90086 
90374 
90586 
90800 
90964 
91152 
91539 
91828 
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1454 

>4581138 /20959
len = 2279 nex = 7
Term 92547 92107 - 0
Intr 92826 92611 - 0
Intr 93078 92954 - 0
Intr 93363 93174 - 0
Intr 93544 93450 - 0
Intr 93830 93620 - 0
Init 94385 94134 - 0

>4581161 76542
len = 974 nex = 2
Term 103655 103129 - 0
Init 104102 104014 - 0

>4581161 74401
len = 944 nex = 2
Term 103655 103178 - 0
Init 104121 104014 - 0

>4581161 73929
len = 2594 nex = 5
Term 101826 101547 - 0
Intr 102280 102202 - 0
Intr 102758 102630 - 0
Intr 103655 103281 - 0
Init 104140 104014 - 0

>4581161 7207629
len = 1078 nex = 1
Sngl 104140 104014 - 0

>4581161 7153194
len = 1076 nex = 2
Term 103655 103065 - 0
Init 104140 104014 - 0

>4581161 7113443
len = 1014 nex = 1
Sngl 104140 104014 - 0

>4581161 7105261
len = 1012 nex =
Term 103655 103129

Init 104140 104014 

>4581161 /23214 

len = 970 nex = 

Term 103655 103172 

Init 104140 104014 

>4581161 /92991 

len = 970 nex = 

Term 103655 103178 

Init 104140 104014 

>4581161 /109952 

len = 951 nex = 

Term 103655 103190 

Init 104140 104014 

>4581161 /118540 

len = 597 nex = 

Term 103655 103544 

Init 104140 104014 

>4581161 /18215 

len = 2507 nex = 

Sngl 104144 104014 

>4581161 724845 

len = 970 nex = 

Term 103655 103177 

Init 104144 104014 

>4581161 724667 

len = 970 nex = 

Term 103655 103178 

Init 104144 104014 

>4581161 73416 

len = 1815 nex = 

Init 12372 12490 

Intr 13293 13348 

Intr 13421 13473 

Intr 13699 13817 

Term 13921 14186 
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>4581161 

len = 

Init 
Term 

>4581161 

len = 

Init 
Term 

>4581161 

len = 

Sngl 

>4581161 

len = 

Sngl 

>4581161 

len = 

Term 
Init 

>4581161 

len = 

Init 

>4581161 

len = 

Term 
Init 

>4581161 

len = 

Term 
Init 

>4581161 

len = 

Term 



/9243 

9 43 nex = 

12408 12490 
13293 13350 

/1618 

698 nex = 

15445 15838 
15921 16142 

/6509 

395 nex = 

80928 80534 
/12929 

396 nex = 

80929 80534 
/24190 

933 nex = 

81003 80534 

81466 81341 

739985 

934 nex = 

81003 80534 

81467 81341 

/20104 

911 nex = 

81003 80557 
81467 81341 

/109026 

753 nex = 

81003 80715 
81467 81341 

734326 

938 nex = 

81003 80534 
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Init 
>4581161 
len = 



81471 81341 
/12459 
915 nex = 



Term 
Init 



81003 80557 
81471 81341 



>4581161 
len = 

>4581161 

len = 

Term 
Init 



/103273 
910 nex = 

/102088 

671 nex = 

81003 80801 
81471 81341 



>4582411 
len = 



2810 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4582411 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 



45451 
45798 
46005 
46212 
47033 
47380 
47685 



45474 
45903 
46073 
46705 
47294 
47588 
47897 



/21223 
2816 nex 



45451 
45798 
46005 
46212 
47033 
47380 
47685 



45474 
45903 
46073 
46705 
47294 
47588 
47903 



>4582411 
len = 



1719 



Init 
Intr 
Intr 
Intr 
Term 



48106 
48506 
49001 
49174 
49615 



48233 
48720 
49081 
49521 
49824 



>4582428
len = 1111 nex =
Init 14118 14404
Term 14774 15228

>4582428 /16737
len = 1407 nex =
Init 18518 18583 + 0
Intr 18928 19009 + 0
Intr 19112 19183 + 0
Intr 19501 19553 + 0
Term 19701 19924 + 0



>4582428 

len = 

Init 
Intr 
Intr 
Term 

>4582437 

len = 

Sngl 

>4582437 

len = 

Sngl 

>4582444 

len = 

Init 
Term 

>45B2444 

len = 

Sngl 

>4582444 

len = 

Sngl 

>4582444 

len = 

Term 
Intr 
Intr 
Intr 
Init 



/97480 

1161 nex = 

18928 19009 

19112 19183 

19501 19553 

19701 19911 

/2900 

397 nex = 

19712 19316 

799899 

6 80 nex = 

7826 8505 

/103070 

790 nex = 

2508 3031 
3115 3290 

/37180 

1440 nex = 

25314 26753 

/123564 

14 5 0 nex = 

25977 26038 

/25550 

167 5 nex = 

27247 27080 

27553 27435 

27722 27661 

28224 28148 

28606 28475 
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>4582444 
len =
Sngl

>4582444
len =
Init
Intr
Intr
Intr
Term

>4582444
len =
Init
Term

>4582444
len =
Term
Init

>4582444
len =
Sngl

>4582444
len =
Term
Intr
Init

>4582444
len =
Sngl

>4584339
len =
Term
Intr
Intr
Intr
Intr



/19236 

651 nex = 

34810 35460 

79568 

1311 nex = 

39087 39157 

39241 39388 

39469 39640 

39922 40108 

40198 40397 

/15190 

718 nex = 

39945 40108 
40198 40662 

/15416 

1431 nex = 

62850 62359 
63789 63368 

774 

1311 nex = 

71021 70539 

725159 

1106 nex = 

75373 75084 
75585 75448 
76189 76007 

794470 

430 nex = 

78400 77979 

713257 

2291 nex = 

19077 18435 

19277 19154 

19501 19359 

19724 19589 

20156 19825 
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Init 
>4584339 
5 len = 

Sngl 
>4584339 

10 

len = 



15 



45 



Init 
Term 



>4584351 
len = 

2 0 Sngl 

>4584351 
len = 

25 

Sngl 
>4584351 

3 0 len = 

Sngl 
>4584351 

35 

len = 
Sngl 

40 >4584351 
len = 



Term 
Intr 
Intr 
Intr 
Init 



50 >4584387 
len = 
Sngl 

55 

>4584387 
len = 
6 0 Sngl 



20725 20474 

/641 

1150 nex = 

26683 25535 

/552 

1065 nex = 

28913 29192 
29589 29977 

/17506 

490 nex = 

30155 29673 

/41808 

521 nex = 

30170 30061 

/143435 

471 nex = 

30170 29700 

/14272 

575 nex = 

30249 29675 

/91878 

1832 nex = 

30771 30505 

31297 31037 

31481 31377 

31721 31591 

32336 32137 

/7818 

1390 nex = 

19033 19118 

/12275 

490 nex = 

31615 31127 
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5325 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



31825 
32018 
32301 
32573 
32903 
33174 
33394 
33795 
33973 
34178 
34678 
34996 
35202 
35374 
35706 
35900 
36106 
36275 
36584 
36857 



31912 
32165 
32493 
32662 
32965 
33243 
33458 
33849 
34094 
34293 
34747 
35100 
35291 
35522 
35817 
35983 
36186 
36499 
36710 
37134 



>4584387 



2598 nex - 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



38193 
38429 
38670 
38917 
39199 
39439 
39625 



38274 
38504 
38833 
39067 
39321 
39534 
40136 



>4584519
len =
Term 97811 97456
Intr 98170 97906
Intr 98475 98309
Intr 98711 98568
Intr 99028 98919
Intr 99221 99113
Intr 99408 99319
Intr 99531 99493
Intr 99800 99606
Init 99974 99924





/21149 
1545 nex ■■ 



Term 100825 100655 
Intr 101015 100926 
Intr 101223 101105
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Intr 
Intr 
Intr 
Init 

>4584519 

len = 

Init 
Term 



101620 101506 

101832 101736 

101991 101928 

102199 102148 

/6308 



28246 28587 
28683 29134 



>4584519 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4584519 
len = 
Sngl 

>4584531 

len = 

Init 
Intr 
Term 

>4584531 

len = 

Term 
Intr 
Intr 
Init 

>4584531 
len = 
Sngl 

>4584531 

len = 

Term 
Intr 
Intr 
Init 



/28170 
1464 nex 



34134 
34343 
34584 
34767 
34979 
35201 



34075 
34218 
34431 
34659 
34916 
35079 



/12980 

612 nex = 

38892 38281 

/1259 

18 81 nex = 

2041 2614 
3185 3256 
3662 3921 

/10016 

1471 nex = 



41067 
41273 
41510 
41872 



40537 
41234 
41378 
41792 



/42200 

430 nex = 

56875 56451 

/38181 

3698 nex = 

4219 3881 

4842 4704 

5250 5029 

7571 7406 
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>4584841 
len = 
Sngl 

>4584841 

len = 

Term 
Init 

>4584841 
len = 
Sngl 

>4584841 

len = 

Term 
Init 

>4584841 

len = 

Term 
Init 

>4584841 

len = 



/37069 
13 01 nex = 
1552 252 

/16143 

1050 nex = 

75412 74936 
75985 75751 

/159247 

236 nex = 

75985 75750 

/6145 

910 nex = 

75412 75090 
75994 75751 

/123475 

732 nex = 

75412 75265 
75996 75751 

79874 

1765 nex = 



Init 81507 81619 + 0
Intr 81801 81961 + 0
Intr 82057 82137 + 0
Intr 82622 82745 + 0
Term 82877 83265 + 0



>4585890 
len = 
Sngl 

>4585890 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 



/120166 

653 nex = 

12573 11926 

/13800 

2230 nex = 

19173 18816 

19409 19311 

19822 19718 

20057 19911 

20227 20141 

20363 20307 
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Init 21042 20494 
>4585890 723237 
len = 1936 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



21526 
21725 
21895 
22074 
22419 
22565 
22733 
23268 



21333 
21662 
21830 
22013 
22332 
22505 
22650 
22886 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4585890 
len = 
Sngl 

>4585890 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4585891 

len = 

Term 
Intr 
Init 

>4585891 

len = 

Sngl 



2530 

27106 
27272 
27474 
27682 
27891 
28113 
28292 
28459 
28596 



nex = 

27159 
27335 
27530 
27788 
27929 
28201 
28370 
28506 
28976 



76787 

10 7 0 nex = 

35922 34853 

76066 

1954 nex = 

63417 63643 

63802 63857 

63959 64073 

64155 64265 

64360 64548 

64929 64985 

65108 65370 

733396 

1691 nex = 

13026 12969 
13198 13099 
13882 13459 

799687 

1230 nex = 

26998 27650 
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>4585891 
/38876
len = 2154 nex =
Term 57147 55769 - 0
Intr 57348 57237 - 0
Intr 57473 57439 - 0
Intr 57784 57707 - 0
Intr 58148 58008 - 0
Intr 58290 58225 - 0
Intr 58533 58389 - 0
Init 58922 58608 - 0



>4585891 
len = 
Sngl 

>4585891 

len = 

Init 
Term 

>4585891 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4585891 

len = 

Sngl 

>4585891 

len = 

Sngl 

>458589I 

len = 



/108157 
407 nex = 
6299 5893 

/113049 

1133 nex = 

67122 67207 
67928 68110 

/3006 

2536 nex = 

68750 68545 

68912 68844 

69081 69011 

69306 69228 

69556 69392 

69684 69640 

69849 69775 

70301 70125 

70649 70386 

71080 70870 

722546 

1469 nex = 

75226 73758 

/16254 

1456 nex = 

75226 73771 

739762 

1633 nex = 



Term 6306 5906 
Intr 6905 6416
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Init 
>4585891 
len = 



7538 7127 

738737 
1846 nex = 



Term 
Intr 
Init 



87122 86839 
87408 87220 
87854 87583 



>45B5896 
len = 



792372 
13 9 9 nex 



Term 
Init 



7865 
8808 



7410 
8691 



>4585896 
len = 



Term 
Init 



7865 
8806 



7424 
8691 



>4585906 

Sngl 
>4585918 
len = 



721936 
206 nex = 
56292 56087 
734035 
1336 nex = 



Init 
Intr 
Intr 
Term 



4050 
4394 
4559 
5076 



4159 
4474 
4602 
5385 



>4585952 
len = 



716644 
1586 nex 



Term 
Intr 
Intr 
Intr 
Init 



29418 
29679 
29904 
30259 
30447 



29073 
29554 
29810 
30174 
30407 



>4585952 
len = 



725421 
1909 nex 



Term 
Intr 
Intr 
Intr 
Intr 
Init 



29418 
29679 
29904 
30259 
30508 
30981 



29073 
29554 
29810 
30174 
30407 
30796 
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>4585952 

len = 

Term 
Init 

>4585952 

len = 

Sngl 

>4585952 

len = 

Sngl 

>4585952 

len = 

Sngl 

>4585952 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4585952 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4585952 

len = 



7857 

474 nex = 

33620 33385 
33853 33807 

/23092 

568 nex = 

39340 39907 

/205610 

550 nex = 

39349 39891 

/15789 

430 nex = 

39378 39802 

/11306 

313 0 nex = 

45205 44960 

45443 45287 

45633 45542 

45776 45712 

45925 45860 

46136 46005 

46398 46225 

46643 46485 

46814 46727 

46979 46898 

47184 47067 

47333 47268 

/20491 

2 610 nex = 

60891 60337 

61479 61333 

61646 61593 

61921 61751 

62418 62209 

62609 62523 

62935 62696 

/1496 

610 nex = 



Sngl 91280 91884 

60 



0 
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>4586024 

len = 

Term 
Init 

>4586024 

len = 

Init 
Term 

>4586024 

len = 

Term 
Init 

>4586024 

len = 

Init 
intr 
Intr 
Intr 
Term 

>4586024 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4586024 

len = 

Init 

>4586024 

len = 

Init 
Intr 
Intr 
Intr 
Term 



/207026 
970 nex 



24258 24031 
24993 24849 



/39169 
113 8 nex ^ 



25959 26001 
26086 26319 



724343 
768 nex 



32828 32380 
33147 32916 



2010 nex = 

35746 35924 

36211 36327 

36439 36609 

36707 36919 

37053 37444 

/6176 

16 90 nex = 

35746 35924 

36211 36327 

36439 36609 

36707 36919 

37053 37417 

/25528 

1217 nex = 

42248 41624 

42832 42420 

/15198 

1372 nex = 

7326 7385 

7539 7640 

7863 7992 

8093 8205 

8475 8697 



>4586065 /41879
len =
Sngl
>4586098 
len = 



2814 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



8727 
8910 
9254 
9412 
9590 
9773 
9978 
10163 
10363 
11133 



8320 
8813 
9144 
9361 
9514 
9717 
9878 
10067 
10251 
10862 



>4586098 

len = 

Term 
Intr 
Intr 
Init 

>4586098 
len = 
Sngl 

>4586098 

len = 

Term 
Intr 
Intr 
Intr 
Init 



1731 



nex ■ 



27031 26717 

27342 27125 

27658 27422 

28447 27982 

/45 

1285 nex = 
48639 49307 
/32461 



61744 61488 

62495 62316 

62716 62607 

62865 62827 

63259 62978 



>4586098 
len = 



/34781 
2678 nex 



Init 
Intr 
Intr 
Intr 
Term 



65140 65679 

66213 66365 

66468 66604 

66694 66921 

67078 67817 



>4586098 
len = 



/15574 
2030 nex =



Init 72537 73258 
Intr 73723 73790
Intr 73883 74014
Intr 74091 74252 
Term 74331 74555 



35 



Init 
Intr 
Intr 
Intr 
Term 

>4586241 
len = 
Sngl 

>4586241 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4586241 



len = 
>4586241 
len =

Sngl 
>4586241 
len = 



45 



87367 
88069 



89142 
89266 



nex = 

87850 
88391 
89031 
89174 
89718 



11254 12386 



2376 

18202 
18521 
18671 
18884 
19370 
19523 
19641 
19852 



nex = 

18375 
18586 
18811 
18961 
19420 
19557 
19752 
20232 



736486 
1427 nex = 
726799 
474 nex = 
24750 24277 



Term 47904 47677
Intr 48038 47983
Intr 48211 48118
Intr 48426 48296
Intr 48622 48519
Intr 48849 48764
Intr 49074 49005
Intr 49267 49159
Intr 49471 49364
Init 49913 49842



>4586241 /34400

len = 1308 nex = 2

Init 51838 52152 + 0
Term 52518 53145 + 0

>4586241 737484

len = 4542 nex = 2

Init 72812 73062 + 0
Term 77290 77353 + 0

>4586241 /100613

len = 4760 nex = 3

Init 72812 73062 + 0
Intr 77290 77355 + 0
Term 77455 77571 + 0

>4586241 /516

len = 5258 nex = 5

Init 72812 73062 + 0
Intr 77290 77355 + 0
Intr 77455 77577 + 0
Intr 77692 77784 + 0
Term 77856 78069 + 0

>4586241 /41510

len = 910 nex = 2

Term 75339 74906 - 0
Init 75807 75569 - 0

>4586241 /35907

len = 1110 nex = 5

Init 76894 77206 + 0
Intr 77290 77355 + 0
Intr 77455 77577 + 0
Intr 77692 77784 + 0
Term 77856 78003 + 0

>4586241 /17520

len = 790 nex = 2

Init 8059 8299 + 0
Term 8495 8848 + 0

>4587582 /125961

len = 490 nex = 1

Sngl 13888 13411 - 0
>4587582

len = 

Term 
Intr 
Intr 
Init 

>4587582 
len = 
Sngl 

>4587582 

len = 

Init 
Term 

>4587582 

len = 

Term 
Init 

>4587582 

len = 

Term 
Init 

>4587582 

len = 



/31554 

1930 nex = 

21991 21883 

22243 22097 

22707 22622 

23280 23003 

74595 

1353 nex = 

23593 24945 

/803 

2112 nex = 

26649 27735 
28584 28760 

/16530 

1096 nex = 

29562 29488 
30202 30172 

/103045 

1270 nex = 

29562 29325 
30202 30172 

/10073 

1665 nex = 



Term 
Init 

>4587641 
len = 
Sngl 

>4587641

len = 

Init 
Intr 
Intr 
Intr 
Term 



29562 28975 
30202 30172 

/19234 

749 nex = 

129286 128538 

/35921 

2548 nex = 

134987 135707 

135938 136040 

136602 136657 

136808 136972 

137280 137534 

/29173 
Init
Intr 
Intr 
Term 

>4587641 

len = 

Init 
Intr 
Intr 
Term 

>4587641 

len = 

Init 
Intr 
Intr 
Term 

>4587641 

len = 

Init 
Intr 
Intr 
Term 

>4587641 

len = 

Init 
Intr 
Term 

>4587641 

len = 

Term 
Init 

>4587641 

len = 

Sngl 

>4587641 



1616 nex = 

26708 26786 

27042 27199 

27523 27614 

27722 28051 

/20583 

1234 nex = 



26708 
27042 
27523 
27722 



26786 
27199 
27614 
27934 



/3997 
2393 nex =



31992 
32147 
32360 
32881 



32060 
32250 
32790 
33710 



79992 
1188 



57159 
57295 
57696 
57908 



nex = 

57189 
57437 
57818 
58346 



/41809 

1062 nex = 

60850 61072 
61195 61445 
61534 61911 

/35151 

64 9 nex = 

81448 81337 
81985 81802 

/12738 

359 nex = 

82158 81800 

/7074 



len = 




690 nex = 



1 
Sngl

>4587641 

len = 

Term 
Init 

>4587677 

len = 

Term 
Intr 
Intr 
Init 

>4587986 

len = 

Sngl 

>4587986 

Sngl 

>4589409 

len = 

Init 
Term 

>4589409 
len = 
Sngl 

>4589409 

len = 

Term 
Intr 
Intr 
Init 

>4589409 

len = 

Init 
Intr 
Intr 
Intr 



82161 81802 

/108142 

831 nex = 

81448 81334 
82164 81802 

/42117 

1397 nex = 

4964 4716 

5403 5201 

5650 5565 

6112 5814 

/14922 

561 nex = 

19040 19589 

/6782 

4 00 nex = 

37458 37260 

/6760 

715 nex = 

15185 15608 
15705 15899 

/15004 

730 nex = 

22595 21875 

/29063 

1952 nex = 

23209 22949 

23499 23305 

23954 23842 

24900 24844 

/16621 

2096 nex = 

25892 26032 

26337 26445 

26520 26581 

26672 26775 
Term

>4589409 

len = 

Init 
Term 

>4589409 
len = 
Sngl 

>4589410 

len = 

Init 
Intr
Intr 
Term 

>4589410 

len = 

Init 
Intr 
Term 



26873 27227 

/28003 

166 3 nex = 

33218 33581 
34356 34880 

/32842 

1305 nex = 

36996 38300 

/41421 

19 95 nex = 

13570 13637 

13729 14122 

14271 14884 

14990 15286 

/6022 

1463 nex = 

13320 13637 
13729 14122 
14271 14782 



>4589410 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4589410 

len = 

Init 
Intr 
Term 

>4589410 

len = 

Term 



/31579 
3250 nex 



25051 
25751 
25940 
26147 
26327 
26548 
26906 
27194 
27491 
27843 



25297 
25844 
26056 
26219 
26445 
26669 
27082 
27338 
27669 
28293 



732783 

716 nex = 

32322 32516 
32607 32735 
32829 32876 

/22152 

16 61 nex = 

40659 40201 
Intr
Intr 
Intr 
Init 

>4589410 

len = 

Sngl 

>4589410 

len = 

Sngl 

>4589410 

len = 

Sngl 

>4589410 

len = 

Init 
Term 

>4589411 
len = 
Sngl 

>4589411 
len = 



40967 40894 

41125 41044 

41359 41262 

41861 41719 

738797 

1392 nex = 

49398 50789 

/157709 

623 nex = 

53543 54165 

/21863 

1341 nex = 

63197 61857 

741153 

1351 nex = 

67494 68408 
68630 68844 

72612 

795 nex = 

11554 11467 

712760 

1958 nex = 





Term 11151 10758 - 0
Intr 11356 11253 - 0
Intr 11566 11467 - 0
Intr 12211 12152 - 0
Intr 12387 12316 - 0
Init 12562 12491 - 0

>4589411 77631

len = 2679 nex = 9

Term 11151 10921 - 0
Intr 11356 11253 - 0
Intr 11566 11467 - 0
Intr 12211 12152 - 0
Intr 12387 12316 - 0
Intr 12562 12491 - 0
Intr 12737 12645 - 0
Intr 12928 12861 - 0
Init 13599 13285 - 0
>4589412
len = 



/144945 
1238 nex =



Init 
Intr 
Intr 
Term 



13026 13085 

13230 13310 

13425 13664 

13744 14126 



>4589412 



/25062 



Term 
Intr 
Init 



14762 14637 
15769 15710 
16311 15854 



>4589412 



4909 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



17362 
17622 
17885 
18026 
18262 
18636 
18783 
19904 
20194 
20386 
20651 
20815 
21132 
21415 
21862 



17458 
17795 
17936 
18133 
18346 
18690 
19226 
19963 
20292 
20496 
20722 
20931 
21281 
21515 
22270 



>4589412 
len = 



792349 
1260 nex 



Term 
Intr 
Intr 
Init 

>4589412 

len = 



1333 917 

1647 1400 

2010 1852 

2176 2133 

/6550 

940 nex = 



Term 
Intr 
Init 



35193 34956 
35603 35471 
35895 35698 



>4589412 
len = 
Term 



/91769 
1450 nex = 
35193 34965 
Intr
Intr 
Intr 
Init 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4589412 

len = 

Init 
Intr 
Term 

>4589412 

len = 

Sngl 

>4589414 

len = 

Sngl 

>4589414 

len = 

Sngl 

>4589414 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



35603 35471 

35977 35698 

36224 36176 

36412 36339 



/37702 



2953 

38862 
39017 
39247 
39446 
39725 
39942 
40327 
41451 



nex = 

38499 
38958 
39138 
39339 
39558 
39819 
40270 
41110 



/19542 

859 nex = 

63381 63483 
63632 63707 
63802 64239 

726264 

600 nex = 

70812 70213 

/92314 

7 90 nex = 

27496 26709 

/21068 

860 nex = 
54424 55283 

/4I64B 
2894 



80194 
80547 
80904 
81081 
81271 
81484 
81673 
81860 
82124 
82304 



nex = 

80450 
80661 
80984 
81169 
81362 
81565 
81775 
81931 
82209 
83087 



>4589415



/4507 
Init
Term 



>4589415 
len = 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4589415 

len = 



12799 
12999 
13168 
13342 
13544 
13699 
14105 



12501 
12887 
13088 
13290 
13436 
13640 
14023 



/159318 
1296 nex 



Term 
Intr 
Init 



17311 16832 
17535 17398 
18127 17694 



>4589415 
len = 



1837 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



18863 
19422 
19664 
19856 
20135 
20375 



19024 
19565 
19755 
20042 
20290 
20699 



>4589418 
len = 



1367 



Term 
Intr 
Intr 
Init 



91 
499 
1228 
1367 



185 
970 
1302 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2006 

12022 
12272 
12470 
12618 
12784 
12976 
13171 
13490 
13705 



nex = 

11700 
12099 
12369 
12554 
12701 
12904 
13070 
13371 
13592 
>4589418

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4589419 
len = 
Sngl 

>4589419 

len = 

Init 
Intr 
Term 

>4589419 
len = 
Sngl 

>4589419 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4589419 

len = 

Term 
Init 



3294 

21324 
21529 
21826 
22163 
22318 
22608 
22801 
22975 
23213 
23570 
23717 
23953 
24189 



21424 
21612 
21872 
22237 
22395 
22670 
22871 
23074 
23317 
23632 
23839 
24105 
24497 



/24081 

550 nex = 

32416 32957 

/106001 

790 nex = 

32416 32802 
32996 33100 
33137 33197 

724665 

730 nex = 

35180 34460 

/21922 

2840 nex = 

35177 34454 

35520 35348 

35757 35622 

36338 35874 

36722 36412 

37293 36801 

/10037 



42723 42430 
43108 42926 



len = 




984 nex = 



5 
Term
Intr 
Intr 
Intr 
Init 

>4589421 

len = 



9429 9171 

9603 9520 

9780 9688 

9922 9850 

10144 10006 

/32718 

1303 nex = 



Term 25886 25513 - 0
Intr 26219 26136 - 0
Intr 26411 26319 - 0
Intr 26577 26505 - 0
Init 26815 26651 - 0



>4589421
len = 
Sngl 

>4589421 

len = 

Term 
Intr 
Init 

>4589421 
len = 
Sngl 

>4589423 

len = 

Term 
Init 

>4589423 

len = 

Term 
Init 

>4589425 
len = 
Sngl 

>4589427 
len = 



/5240 

641 nex = 

61499 62139 

/20850 

2056 nex = 

61249 60746 
61952 61661 
62801 62305 

/11329 
442 nex = 
80041 79600 

733833 

850 nex = 

14120 13771 
14612 14518 

736859 

854 nex = 

14120 13819 
14672 14518 

7113229 

511 nex = 

33439 32996 

71076 

49 8 nex = 
Sngl
>4589427

len =

Term
Intr
Intr
Init
>4589428 



25047 24550 

/17208 

1482 nex = 

25597 25201 

25959 25691 

26181 26057 

26540 26433 

/33021 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



3085 

11298 
12000 
12319 
12551 
12718 
13160 
13353 
13510 
13702 
13978 
14194 



nex = 

11398 
12227 
12366 
12627 
12840 
13264 
13425 
13581 
13894 
14107 
14382 



>4589428 

len =

Init
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Term 



12000 
12319 
12551 
12718 
13160 
13353 
13510 
13702 
13978 
14194 



nex = 

12227 
12366 
12627 
12840 
13264 
13425 
13581 
13894 
14107 
14381 



>4589428 

len =

Init
Intr
Intr
Term



1819 



nex 



27537 27623 

28018 28109 

28242 28315 

28403 29075 

/21695 

2374 nex = 



Init 
Intr 
Intr 
Intr 
Intr 



2805 3043 

3162 3288 

3751 3868 

3992 4128 

4215 4316 
Intr
Term 

>4589428 

len = 

Init 
Intr 
Term 

>4589428 

len = 

Term 
Intr 
Intr 
Init 

>4589428 

len = 

Sngl 

>4589430 

len = 

Sngl 

>4589430 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4589430 
len = 
Sngl 

>4589430 

len = 

Init 
Intr 
Intr 
Intr 
Term 



4412 4504 
4600 5178 

/1326 

1476 nex = 

37323 37772 
38054 38269 
38350 38798 

728772 

1258 nex = 

39781 39355 

39944 39891 

40129 40040 

40613 40330 

/22839 

4 96 nex = 

6020 5525 

/37580 

134 nex = 

20755 20888 

/35379 

1546 nex = 

36633 36700 

36804 37042 

37150 37290 

37379 37813 

37902 38175 

726766 

4 30 nex = 

37902 38175 

74357 

2437 nex = 

57993 58176 

58292 58405 

59163 59390 

59478 59917 

60008 60429 



>4589430 




737065 
len =



1885 nex =



Init 
Intr 
Intr 
Term 



60646 60889 

61155 61335 

61583 61724 

61827 61896 



>4589430 
len = 



/110428 
535 nex 



Term 
Init 



72697 72339 
72873 72781 



>4589432 
len = 



Term 
Init 



/3332 
771 ne 



15502 15080 
15850 15746 



>4589432 
len = 



Init 
Term 



/123678 
1078 nex =



21735 22234 
22314 22812 



>4589432 
len = 



/40179 
2617 nex 



Term 
Intr 
Intr 
Intr 
Init 



43904 
44457 
44853 
45547 
45970 



43354 
44374 
44548 
45471 
45638 



>4589433 
len = 
Sngl 

>4589434 
len = 



/12315 
194 nex = 
10614 10421 
728462 
1771 nex = 



Init 
Term 



10665 10924 
11022 11140 



>4589434 
len = 



/104779 
1639 nex =



Init 
Term 



10667 10924 
11022 11140 



>4589434 

/108581
1690 nex

Init
Intr
Term

>4589434

len =

Init
Term

>4589434

len =
Sngl

>4589434
len =

Term
Intr
Intr
Intr
Init

>4589434
len =
Sngl

>4589434

len =

Term
Intr
Intr
Intr
Intr
Intr
Intr
Init

>4589434

len =
Sngl

>4589434
len = 



10667 10924 
11022 11140 
11602 11648 

/33570 

1638 nex = 

10684 10924 
11022 11140 

/9336 
599 nex = 
1164 566 

/39717 

1792 nex = 

31362 31004 

31686 31448 

32066 31860 

32345 32156 

32795 32494 

76827 
2290 nex = 
37865 37353 

/7571 

2455 nex = 

43285 43064 

43481 43379 

43949 43881 

44126 44067 

44318 44225 

44720 44603 

44970 44809 

45518 45143 

/31680 

550 nex = 

46938 46393 

/12344 

2121 nex = 



Term 52868 52559 
Intr 54223 54128
Intr
Init 



>4589434 



Sngl 

>4589434

Term
Intr
Intr
Init

Term
Init

>4589434

len =

Term
Init

>4589435

len =

Sngl

>4589435

len =

Init
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Term

>4589435 



54396 54310 
54679 54472 

/15457 

626 nex = 

61850 61225 

/1876 

2436 nex = 

62504 62245 

63661 62602 

64223 64000 

64396 64311 

/159403 

850 nex = 

67516 66998 
67838 67591 

/15623 

1390 nex = 

76523 76081 
77465 76628 

724826 

610 nex = 

21600 20996 

76424 

2470 nex = 

29316 29493 

29595 29735 

29824 29872 

30020 30037 

30228 30343 

30437 30550 

30867 30955 

31183 31260 

31353 31460 

31549 31779 

/10035 

1247 nex = 



Init 37688 37775 
Intr 37871 38014 
Intr 38315 38407
Term
>4589435 
len = 
Sngl 
>4589435 
len = 
Sngl 
>4589435 
len = 



38519 38934 
723983 
730 nex = 
42012 42734 
/22609 
459 nex = 
69525 69067 
793294 
1273 nex = 



Term 
Intr 
Intr 
Intr 
Init 



70107 
70459 
70608 
70794 
71191 



69919 
70342 
70552 
70727 
70923 



>4589436 
len = 



740348 
1614 nex 



Term 
Intr 
Intr 
Init 

>4589436 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



16780 16494 

17196 17147 

17517 17283 

18107 17946 

76991 

3085 nex = 



29559 
29982 
30155 
30498 
30872 
31592 



29939 
30049 
30401 
30792 
31246 
32643 



>4589436 
len = 
Sngl 

>4589436 
len = 



72373 
769 nex = 
31994 32762 
734946 
2208 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 



38601 
38850 
39015 
39199 
39507 
39820 



38660 
38896 
39108 
39346 
39719 
39954 
Intr
Intr 
Term 



40047 40181 
40269 40484 
40687 40808 



>4589437 
len = 



/42922 
1778 nex 



Term 
Init 



10942 10397 
11556 11034 



>4589437 
len = 
Sngl 

>4589437 
len = 



2456 



nex 



Term 
Intr 
Intr 
Intr 
Init 

>4589437 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



55692 55343 

55970 55836 

56286 56230 

56657 56523 

57335 57057 

/108568 



3071 

59166 
59799 
60223 
60986 
61230 
61440 
62046 



59592 
60132 
60263 
61027 
61351 
61548 
62236 



>4589437 

>4589437 
len = 
Sngl 

>4589437 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



9162 
9462 
9629 
9803 
10024 
10176 



nex = 

9041 
9359 
9551 
9718 
9883 
10069 
10328 
>4589438 725545

len = 1215 nex = 

Term 24727 24486 

Init 25352 25278 

>4589438 /30064 

len = 1570 nex = 

Init 29309 29481 

Intr 29563 29748 

Intr 29831 29993 

Intr 30080 30266 

Term 30349 30622 

>4589438 /21843 

len = 989 nex = 

Sngl 46380 47368 

>4589438 /22434 

len = 1518 nex = 

Init 5568 5639 

Intr 5947 6024 

Intr 6128 6406 

Term 6506 7066 

>4589439 /10004 

len = 463 nex = 

Sngl 40998 41460 

>4589439 /26026 

len = 670 nex = 

Sngl 46310 45649 

>4589439 /41488 

len = 970 nex =

Init 47658 48053 

Term 48141 48323 

>4589439 723276 

len = 1822 nex = 

Term 63167 62825 

Intr 63423 63256 

Intr 63691 63515 

Intr 63954 63873 



Reference No. 2750-942P 



Intr 
Init 

>4589439 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4589439 

len = 

Sngl 

>4589439 

len = 

Sngl 

>4589440 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4589440 
len = 

>4589443 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4589443 

len = 

Term 
Intr 
Intr 
Intr 



64223 64046 
64646 64414 



1950 nex = 

65270 64949 

65611 65475 

65803 65700 

66294 66199 

66470 66385 



79482 



67879 69007 
/42187 



1881 



nex 



1357 1831 

1938 2061 

2142 2278 

2717 2800 

2908 3237 

/114071 



2052 nex 



10117 
10710 
10911 
11125 
11323 
11525 
11688 
11918 



10334 
10812 
11018 
11234 
11430 
11570 
11819 
12161 



/40267 

1345 nex = 

32380 32218 

32541 32485 

32744 32640 

32964 32857 
Init 33319 33125 -

>4589443 /144066

len = 1510 nex = 3

Term 37967 37540 -
Intr 38213 38133 -
Init 39044 38777 -

>4589444 /40174

len = 2138 nex = 7

Init 5581 5696 +
Intr 6020 6178 +
Intr 6278 6453 +
Intr 6565 6842 +
Intr 6928 7063 +
Intr 7152 7263 +
Term 7457 7718 +

>4589444 /120707

len = 1653 nex = 5

Init 68700 68826 +
Intr 68906 69190 +
Intr 69489 69569 +
Intr 69673 69767 +
Term 69988 70352 +

>4589444 /10687

len = 582 nex = 2

Init 7159 7263 +
Term 7457 7740 +

>4589444 /13338

len = 1435 nex = 2

Term 71110 70333 -
Init 71767 71187 -

>4589444 /12533

len = 1612 nex = 3

Term 72674 72238 -
Intr 73073 72743 -
Init 73849 73302 -

>4589444 7327

len = 1816 nex = 5

Init 74986 75176 +
Intr 75871 75973 +
Intr 76059 76148 +
Intr 76241 76339 +
Term 76490 76801 +



>4589445 
len = 



Init 
Term 



139 
1027 



>4589445 
len = 



738265 
2260 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



19710 
20156 
20310 
20491 
20692 
20926 
21150 
21401 
21645 



19895 
20228 
20392 
20593 
20848 
21054 
21236 
21445 
21969 



>4589445 
len = 
Sngl 

>4589445 
len = 



23376 22781 
/40717 
1171 nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4589445 



59227 59185 

59432 59317 

59660 59559 

59846 59745 

60036 59926 

60355 60179 

/6617 



1852 



nex 



Init 
Intr 
Intr 
Term 



79257 
79428 
80505 
80741 



79346 
79703 
80663 
81108 



2682 



nex =



Term 
Intr 
Intr 
Intr 
Intr 



14307 
14445 
14633 
14989 
15212 



14077 
14401 
14514 
14901 
15134 
Intr
Intr 
Init 

>4589446 

len = 

Init 
Intr 
Intr 
Term 

>4589450 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4589950 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4589950 

len = 

Sngl 

>4589950 

len = 

Sngl 

>4589950 

len = 

Init 
Intr 
Intr 
Term 

>4589950 

len = 

Init 



15616 15454 
15937 15858 
16118 16011 

/38867 

1907 



17211 
18004 
18241 
18625 



nex = 

17769 
18141 
18490 
19117 



/124077 
1690 



6092 
6597 
7007 
7347 
7529 
7744 



nex = 

6223 
6755 
7256 
7436 
7617 
7770 



/6704 
2072 



15942 
16230 
16673 
16932 
17165 



nex = 

16097 
16315 
16798 
17059 
17489 



/152803 
611 nex = 
23934 24544 
/38094 
326 nex = 
24221 24542 
/20770 
1643 nex = 



27913 
28278 
28594 
29119 



28123 
28494 
28778 
29555 



/17664 
2673 nex = 
30408 30899 
Intr
Intr
Term

>4589950

len =

Init
Intr
Intr
Intr
Term

>4589950
len =
Term
Init

>4589950
len =

Term
Init

>4589950

len =
Sngl

>4589969
len =
Init
Term

>4646215
len =

Init
Intr
Term

>4646215
len =
Sngl

>4646229
len =
Init



31702 32034 
32610 32726 
32854 33080 

/21841 

1630 nex = 

43014 43406 

43536 43676 

43768 43908 

44011 44117 

44298 44634 

/11283 

1270 nex = 

3662 3371 
4626 4390 

/100317 

1255 nex = 

3662 3416 
4670 4390 

/41633 
569 nex = 
50817 51385 

/120267 

610 nex = 

40423 40571 
40797 41025 

734289 

776 nex = 

19221 19325 
19412 19533 
19667 19993 

79248 

8 67 nex = 

21032 20166 

7103197 

751 nex = 

3186 3298 
Term
>4646229 
len = 
Sngl 
>4646229 
len = 



3509 3709 
/19631 
691 nex = 
56338 57028 
72267 
2371 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



59713 
59944 
60076 
60872 
61359 
61583 
61829 



59825 
59982 
60125 
60983 
61406 
61713 
61997 



>4646229 

Sngl 
>4662609 
len = 



76261 
408 nex = 
64632 65028 
/24161 
1541 nex = 



Init 
Intr 
Intr 
Intr 
Term 

>4662609 

len = 



110615 110883 

mill 111219 

111307 111437 

111538 111612 

111789 112155 

7796 

618 nex = 



Init 
Term 



110728 110883 
mill 111219 



>4662609 
len = 



/18583 
1090 



nex =



Init 
Intr 
Intr 
Intr 
Intr 
Term 



11288 
11708 
11922 
12074 
12253 
12327 



11484 
11848 
12002 
12198 
12295 
12377 



>4662609 
len = 



732558 
2255 nex 



Init 
Intr 



119155 119438 
119536 119598 






Intr 
Intr 
Term 

>4662609 

len = 



120052 120472 
120567 120718 
120832 121409 

/103288 

1570 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4662609 

len =

Term 
Init 

>4662609 
len = 
Sngl 

>4662628 

len = 

Init 
Intr 
Intr 
Term 

>4662628 

len = 

Term 
Init 

>4662628 
len = 
Sngl 

>4662637 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 



42796 
43094 
43358 
43556 
43725 
43910 



42919 
43245 
43467 
43641 
43829 
44068 



/7803 
379 ne 



58481 
58640 



58268 
58572 



/34358 

873 nex = 

75547 76419 

/9376 

1657 nex = 

31339 31677 

31931 32188 

32461 32664 

32749 32995 

/154050 

1099 nex = 

36454 36398 
37496 37156 

/20900 
1390 nex = 
7708 9057 

/20182 

2859 nex = 

28149 27892 

28599 28528 

28829 28763 

29070 28932 

29862 29688 

30330 30177 
Init

>4662640 

len = 

Sngl 

>4662640 

len = 

Sngl 

>4662640 

len = 

Term 
Init 

>4662640 

len = 

Term 
Init 

>4662640 
len = 
Sngl 

>4662640 

len = 

Term 
Intr 
Init 

>4678196 

len = 

Term 
Intr 
Intr 
Init 

>4678196 

Term 
Intr 
Intr 
Init 



30750 30535 

/103735 

670 nex = 

1875 1570 

/13193 

741 nex = 

24672 23932 

/108284 

1342 nex = 

1875 1599 
2933 2230 

732647 

1474 nex = 



1875 
3144 



1671 
2230 



/111177 

370 nex = 

3147 2785 

/32660 

2432 nex = 

7037 6524 
7476 7292 
8955 8479 

71940 

830 nex = 

30655 30442 

30869 30753 

31063 30949 

31271 31146 

730469 

858 nex = 

30655 30458 

30869 30753 

31063 30949 

31315 31146 
>4678196
len = 



984 



Term 
Intr 
Intr 
Init 



30655 30384 

30869 30753 

31063 30949 

31367 31146 



>4678196 
len = 



/994 



1046 nex =



Term 
Intr 
Intr 
Init 

>4678219 
len = 
Sngl 

>4678219 
len = 



7202 6888 

7473 7304 

7664 7566 

7933 7778 

/21113 

414 nex = 

16445 16858 

/124189 

1617 nex = 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



17224 17328 

17546 17641 

17745 17806 

18172 18305 

18426 18510 

18608 18840 



/10500 



Init 
Intr 
Term 



18935 19234 
19376 19429 
19531 19898 



>4678219 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4678219 

len = 



1498 nex =

20212 20321 

20661 20780 

20859 20904 

21091 21176 

21267 21396 

21486 21709 

/1918 



Init 22504 22649
Intr 22728 22814
Intr 22918 23020
Intr 23256 23322
Term 23414 23593

len =

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init

>4678258 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4678258 

len = 

Sngl 

>4678266 

len = 

Sngl 

>4678266 

len = 

Sngl 

>4678266 

len = 

Init 
Intr 
Intr 
Term 



35527 
35817 
36025 
36381 
36617 
36786 
37244 
37583 



2950 

38519 
38936 
39250 
39398 
39714 
39870 
40128 
40328 
40471 
40791 



nex = 

35138 
35649 
35917 
36162 
36472 
36696 
37130 
37343 



nex = 

37849 
38848 
39065 
39336 
39532 
39822 
39990 
40247 
40416 
40667 



/30342 

533 nex = 

88640 89172 

/19033 

577 nex = 

1664 2240 

722956 

642 nex = 

4832 5473 

/13259 

2274 nex = 

73401 73742 

73945 74637 

74813 75124 

75360 75674 



>4678266

7332
len =



Sngl 
>4678266 
len = 



77266 76186 
/27474 



2792 



nex 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



90126 
90502 
90835 
91028 
91211 
91432 
91850 
91990 
92372 
92566 



89784 
90362 
90594 
90903 
91118 
91331 
91778 
91933 
92188 
92480 



>4678291 
len = 



1917 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



17191 
17368 
17968 
18232 
18427 
18635 
18898 



17291 
17833 
18085 
18341 
18512 
18780 
19107 



>4678291 
len = 



/30751 
1344 nex



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



22561 22613 

22732 22806 

22896 22964 

23038 23118 

23250 23345 

23435 23491 

23601 23896 



>4678291 
len = 
Sngl 

>4678291 
len = 



/33606 
446 nex = 
23601 23896 
75493 
1270 nex = 



Init 
Term 



>4678291 



27193 27656 
27889 28459 



737225 
2003 nex 
Init 42970 43714
Intr 43777 43917 
Term 44045 44522 



Init 
Intr 
Term 



46242 46546 
46860 46962 
47050 47174 



len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2552 

66537 
66874 
67068 
67487 
67678 
67904 
68418 
68829 



nex = 

66278 
66788 
66979 
67382 
67584 
67821 
68343 
68689 



>4678291 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4678315 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4678315 

len = 

Sngl 

>4678315 

len = 

Sngl 



2530 

66537 
66874 
67068 
67487 
67678 
67904 
68422 



nex = 

66335 
66788 
66979 
67382 
67584 
67821 
68343 



734579 
1990 nex



13656 
13988 
14162 
14367 
15188 



13884 
14067 
14239 
14437 
15636 



7787 
324 nex = 
18778 19101 
7147241 
341 nex = 
2959 3299 



>4678315 /34551

len = 1545 nex = 7

Term 41502 41430 - 0
Intr 41669 41604 - 0
Intr 41835 41771 - 0
Intr 41994 41931 - 0
Intr 42197 42088 - 0
Intr 42483 42396 - 0
Init 42974 42809 - 0

>4678315 /9581

len = 1336 nex = 2

Init 46422 46595 + 0
Term 47332 47757 + 0

>4678340 /12236

len = 1954 nex = 5

Term 22952 22722 - 0
Intr 23125 23045 - 0
Intr 23705 23209 - 0
Intr 24098 23940 - 0
Init 24675 24184 - 0

>4678340 726537

len = 2021 nex = 8

Term 22952 22713 - 0
Intr 23125 23045 - 0
Intr 23255 23209 - 0
Intr 23445 23350 - 0
Intr 23705 23639 - 0
Intr 24098 23940 - 0
Intr 24254 24184 - 0
Init 24733 24323 - 0

>4678340 /13870

len = 1640 nex = 3

Term 3619 3273 - 0
Intr 3881 3700 - 0
Init 4912 4243 - 0

>4678371 /21034

len = 971 nex = 2

Init 109206 109301 + 0
Term 109446 109715 + 0

>4678371 /113114

len = 892 nex = 2

Init 109206 109301
Term 109446 109693



Term 41665 41333 

Intr 42181 42112 

Init 42443 42246 

>4678371 /41730 

len = 440 nex =

Sngl 48928 49367 

>4678705 /41543 



Init 
Intr 
Term 



11432 11814 
12077 12267 
12387 12550 



>4678705 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



1690 nex = 

14667 14750 

14858 15075 

15261 15331 

15428 15529 

15646 15763 

15929 16064 

16150 16346 



>4678705 



/125324 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



2171 nex = 

16715 16857 

17191 17237 

17759 17858 

18087 18150 

18237 18353 

18453 18561 

18819 18885 

/24780 

791 nex = 



Init 
Term 



25236 25351 
25594 26026 



len =



2675 nex = 



5 



Term
Intr
Intr
Intr
Init

>4678705

len =

Sngl

>4678705

len =
Sngl

>4678705
len =

Term
Intr
Init

>4680765
len =

Sngl
>4680765

len =

Init
Intr
Intr
Term

>4680765

len =

Term
Intr
Intr
Intr
Intr
Init

>4680765 



46430 45826 

46602 46534 

46744 46682 

46896 46840 

48500 47963 

727536 

281 nex = 

50278 49998 

/8207 

1465 nex = 

51349 49885 

/15484 

64 6 nex = 

68931 68742 
69161 69058 
69387 69316 

/19054 

658 nex = 

103945 104602 

/12712 

1157 nex = 

2888 2958 

3052 3180 

3402 3584 

3670 4024 

729972 

1551 nex = 

4691 4427 

4849 4793 

4976 4932 

5238 5176 

5385 5322 

5977 5762 

7112559 

1614 nex = 






Term 
Intr 
Intr 



4691 
4849 
5238 



4427 
4793 
4932 
Intr 5385 5322
Init 6040 5762

>4680765 /33173

len = 2179 nex =

Term 87951 87650
Intr 88636 88413
Intr 88934 88817
Intr 89343 89032
Init 89828 89505

>4680765 723348

len = 2238 nex =



Term
Intr
Intr
Intr
Init

>4680765

len =

Term
Intr
Intr
Intr
Init

>4689466

len =
Sngl

>4691223
len =
Term
Intr
Intr
Intr
Init

>4691223
len = 



87951 87633 

88636 88413 

88934 88817 

89343 89032 

89870 89505 

/7D57 

3012 nex = 

87951 87675 

88636 88413 

88934 88817 

89343 89032 

89750 89505 

/31988 

610 nex = 

44480 45086 

/19796 

1654 nex = 

118925 118453 

119112 119052 

119318 119145 

119652 119563 

120106 119970 

/33058 

1390 nex = 



Init 120319 120641
Term 121277 121707

>4691223 /7104 



len = 2650 nex = 

Term 27184 26876
Intr 27622 27443
Intr 27988 27700
Intr 28141 28074
Intr 28414 28229
Intr 28667 28503
Init 29523 28767

>4691223 797866

len = 654 nex =

Term 53529 53241
Intr 53668 53616
Init 53894 53745

>4691223 /118329

len = 507 nex =

Sngl 61751 62257

>4691223 /29714

len = 473 nex =

Term 67718 67641
Intr 67897 67808
Init 68113 67979

>4691223 /115554

len = 379 nex =

Sngl 72403 72781

>4691223 /20915

len = 3232 nex =

Init 78459 78663
Intr 79140 79266
Intr 79400 79506
Intr 79713 79874
Intr 79963 80109
Intr 80186 80380
Intr 80488 80666
Term 80745 80975

>4691223 /9841

len = 523 nex =

Sngl 81514 80992

>4699904 /21639

len = 1521 nex =

Term 29454 28950






15 



Init 

>4699904 

5 len = 

Term 
Intr 
Intr 

10 Init 
>4699904 
len = 
Sngl 
>4699904 
2 0 len = 

Sngl 
>4699904 

25 

len = 
Sngl 

30 >4699904 
len = 
Sngl 
>4699904 
len = 



35 



40 



Init 
Intr 
Intr 
Term 

45 >4713943 
len = 
Init 

5 0 Intr 
Intr 
Intr 
Intr 
Term 



55 

>4713943 
len = 
60 Init 



30470 29953 

/30530 

2040 nex = 

32070 31765 

32214 32162 

33121 33047 

33255 33215 

/37701 

790 nex = 

38588 37803 

/30113 

790 nex =

38588 37804 

/105950 

412 nex = 

38588 38177 

/9278 

310 nex = 

38588 38288 

78446 

1770 nex = 

82618 82845 

83226 83301 

83389 83561 

84124 84387 

738549 

2170 nex =

11662 11887 

12188 12265 

12354 12614 

12703 13020 

13129 13287 

13447 13623 

719103 

3190 nex = 

18821 18995 
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Intr 




19673 


Intr 19753 19791
Intr 19928 20028
Intr 20168 20280
Intr 20369 20468
Intr 20547 20693
Intr 20774 20834
Intr 20925 21022
Intr 21110 21252
Intr 21361 21444
Intr 21529 21618
Term 21702 22007



>4713943 /10072
len = 1233 nex =
Init 20810 20834
Intr 20925 21022
Intr 21110 21252
Intr 21361 21444
Intr 21529 21618
Term 21702 22042

>4713943 /21947
len = 1195 nex =
Init 28706 29150
Term 29228 29900

>4713943 /16095
len = 706 nex =
Sngl 69411 70116

>4713943 /155207
len = 1067 nex =
Init 73695 73912
Term 74344 74761

>4725940 /20383
len = 687 nex =
Sngl 106710 107396

>4725940 /6554
len = 614 nex =
Sngl 110711 111324

>4725940 /92350
len = 1090 nex =






Term 
Intr 
Intr 
Init 

>4725940 
len = 
Sngl 

>4732167 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4732168 
len = 
Sngl 

>4732168 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4732168 

len = 

Term 
Init 

>4732169 

Sngl 
>4732169

Sngl 
>4733952 



17635 17313 

17915 17751 

18084 18006 

18376 18165 

/28572 

537 nex =

96158 95622 

/2296 

1813 



45294 
45586 
46139 
46531 
46828 



nex = 

45510 
45739 
46198 
46591 
47106 



/8387 
404 nex = 
120482 120649 
738543 
1971 nex = 



79492 
79689 
79895 
80126 
80429 
81022 



79052 
79588 
79770 
79992 
80313 
80821 



/40690 

670 nex = 

80429 80357 
81025 80821 

/10692 

532 nex = 

52688 52157 

/25430 

550 nex = 

52692 52146 

/37506 



60 len = 



2578 nex = 



7 






Init 
Intr
Intr 
Intr 
Intr 
Intr 

>4733952 

len = 

Init 
Term 

>4733952 

len = 

Init 
Intr 
Term 

>4733952 

len = 



Init 
Intr 
Term 

>4733952 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4733952 

Sngl 
>4733952 
len = 
Sngl 
>4733953 
len = 
Sngl 



117152 117890 

117977 118056 

118424 118501 

118902 118997 

119102 119227 

119304 119369 

119460 119729 

/11975 

1396 nex = 

2301 2629 
3230 3696 

/10295 

1244 nex = 

4419 5051 
5122 5416 
5495 5662 

/109289 

1232 nex = 

4431 5051 
5122 5416 
5495 5662 

/36621 

1918 



86725 
87006 
87271 
87467 
87678 
88281 



nex = 

86366 
86815 
87194 
87359 
87564 
87885 



/28602 
599 nex = 
97194 97792 
732795 
719 nex = 
97257 97975 
/106946 
473 nex =
118125 117810 






>4733953 
len = 



Term 10729 10095 - 0
Intr 10883 10802 - 0
Intr 11125 11042 - 0
Intr 11384 11217 - 0
Intr 11617 11461 - 0
Init 12193 11723 - 0



>4733953 

len = 

Term 
Init 

>4733953 

len = 

Sngl 

>4733953 

len = 

Sngl 

>4733953 

len = 

Init 
Intr 
Term 

>4733953 

len = 

Term 
Intr 
Intr 
Init 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



1239 nex = 

16338 16088 
17326 17040 

/2629 

1270 nex = 

16338 16093 

/39404 

379 nex = 

17961 17583 

/19759 

2110 nex = 

20860 21383 
21459 21893 
22639 22966 

/30753 

1171 nex = 



60229 
60602 
60835 
61029 



59859 
60530 
60709 
60968 



737277 
2830 nex =



61852 
62673 
62958 
63137 
63340 
63487 
63686 
63865 



62588 
62871 
63050 
63265 
63393 
63564 
63748 
63977 






Term 

>4733953 

len = 

Init 
Intr 
Term 

>4733953 

len = 

Term 
Intr 
Intr 
Init 

>4733957 

len = 

Term 
Init 

>4733984 

len = 

Init 
Intr 
Term 

>4733984 

len = 

Init 
Intr 
Term 

>4734003 
len = 
Sngl 

>4734003 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



64059 64359 

/101924 

730 nex =

73 181 
262 357 
434 801 

/32143 

1905 nex = 



97964 
98361 
98605 
99481 



97577 
98053 
98433 
99057 



/125929 

970 nex =

7176 6924 
7349 7257 

726824 

2050 nex = 

47232 47598 
47718 47928 
48007 48272 

729669 

1938 nex = 

47232 47598 
47718 47928 
48007 48229 

793895 

533 nex = 

20795 21327 

732391 

2026 nex = 



45023 
45415 
46054 
46221 
46354 
46863 



45326 
45959 
46140 
46277 
46798 
47048 



60 >4734003 



79500 
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Term 70600 70084 
Init 70885 70675 



len =



1210 nex =



Term 69956 69706 

Intr 70115 70043 

Intr 70462 70257 

Intr 70600 70537 

Init 70912 70675 

53195 /5767 



Term 102330 101848 
Init 102742 102437 



len = 

Term 
Intr 
Intr 
Intr 
Init 



1902 



nex =



24230 23691 

24471 24314 

24713 24567 

25027 24796 

25592 25383 

77837 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4753645 

len = 

Sngl 

>4753645 

len = 

Sngl 



2021 

41032 
41510 
41693 
41892 
42064 
42268 
42443 
42608 
42827 



nex = 

41313 
41597 
41806 
41974 
42178 
42359 
42508 
42701 
43052 



742927 
889 nex = 
15300 14412 
7748 
912 nex = 
15333 14422 



60 >4753645 



711649 






778 nex = 1
Sngl 20284 19507 - 0

>4753645 /18082
len = 1035 nex = 3
Term 25637 25286 - 0
Intr 26018 25864 - 0
Init 26320 26110 - 0

>4753645 /4311
len = 991 nex = 4
Term 31713 31426 - 0
Intr 31904 31800 - 0
Intr 32064 32003 - 0
Init 32416 32195 - 0

>4753645 /121013
len = 1018 nex = 3
Term 33198 32799 - 0
Intr 33586 33527 - 0
Init 33816 33685 - 0

>4753645 /20847
len = 1904 nex = 8
Term 41741 41372 - 0
Intr 42028 41836 -
Intr 42227 42105 - 0
Intr 42440 42301 - 0
Intr 42595 42527 - 0
Intr 42881 42686 - 0
Intr 43051 42977 - 0
Init 43275 43144 - 0

>4755178 /7413
len = 550 nex = 1
Sngl 23401 23949 + 0

>4755179 /37863
len = 2208 nex = 4
Term 38083 37560 - 0
Intr 38382 38183 - 0
Intr 38861 38468 - 0
Init 39767 39548 - 0

>4755179 /33047
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len = 


477 nex = 1
Sngl 45523 45047 -

>4755179 /13002
len = 1676 nex = 2
Init 50115 50339 +
Term 50656 51087 +

>4755179 /125387
len = 774 nex = 2
Init 50164 50339 +
Term 50656 50937 +

>4755179 /40183
len = 2251 nex = 7
Term 73491 73290 -
Intr 73894 73775 -
Intr 74166 73981 -
Intr 74402 74263 -
Intr 75048 74800 -
Intr 75246 75129 -
Init 75540 75409 -

>4755179 /4933
len = 1523 nex = 0

>4755179 /40305
len = 1342 nex = 1
Sngl 82160 83501 +

>4755179 /112223
len = 279 nex = 1
Sngl 86044 85766 -

>4755179 /230
len = 2782 nex = 8
Term 90030 89591 -
Intr 90265 90182 -
Intr 90522 90366 -
Intr 90939 90674 -
Intr 91320 91048 -
Intr 91530 91419 -
Intr 91829 91610 -
Init 92372 91910 -

>4755179 /113514
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2378 



Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4755185 

len = 

Term 
Intr 
Init 

>4755185 

len = 

Sngl 
>4755185 



7459 
7721 
7889 
8124 
8629 
9601 



7240 
7576 
7809 
7978 
8530 
9467 



/27930 

861 nex = 

104138 103879 
104520 104451 
104739 104586 

/4047 

885 nex =

104763 103879 
/5198 



Term 
Intr 
Init 

>4755185 
len = 

>4755185 
len = 
Sngl 

>4757388 
len = 



Term 
Intr 
Init 

>4757388 

len = 

Init 
Term 

>4757390 

len = 



104138 103878 
104520 104451 
104780 104586 

/99800 

996 nex =

733377 

1606 nex =

89482 90543 

/34183 

996 nex =

686 30 
828 758 
1025 911 

/20852 

1221 nex = 

26560 26673 
26774 27780 

/37617 

2548 nex = 
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Term 
Intr 
Init 

>4757390 
len = 
Sngl 

>4757390 
len = 



13877 13660 
14019 13968 
16207 15422 

/109272 

528 nex = 

37853 38380 

/100074 



519 



nex 



Sngl 
>4757390 
len = 



37862 38380 
/103168 
1856 nex = 



Init 
Intr 
Intr 
Term 



41052 41145 

41501 41594 

42124 42386 

42471 42907 



/2102 



3220 



nex = 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



18611 
18807 
19090 
19281 
19487 
19668 
19823 
19989 
20445 
20669 
21040 
21231 
21413 



18194 
18736 
18977 
19168 
19387 
19590 
19749 
19927 
20365 
20544 
20916 
21126 
21349 



>4757392 

Term 
Intr 
Init 

>4757392 

len = 



/31656 

970 nex = 

28052 27639 
28365 28148 
28605 28452 

796588 

590 nex = 



Term 
Init 



41513 41072 
41654 41598 



>4757392 

60 



/13932 
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1822 nex ■ 



Term 
Intr 
Intr 
Intr 
Init 

>4757392 

len = 

Term 
Intr 
Intr 
Init 

>4757392 

len = 

Sngl 

>4757392 

len = 

Sngl 

>4757392 

>4757392 

len = 

Init 
Intr 
Intr 
Term 

>4757392 

len = 

Init 
Intr 
Intr 
Term 

>4757392 
len = 
Sngl 

>4757392 
len = 



48918 48477 

49253 49006 

49533 49327 

49777 49615 

50298 50008 

736559 

1821 nex = 

48918 48478 

49253 49006 

49533 49327 

49777 49615 

72204 

550 nex = 

5387 4851 

79425 

579 nex = 

57089 57667 

7111154 

1092 nex = 

726899 

2182 nex = 

73097 73405 

73958 74660 

74754 74910 

74995 75278 

712456 

1995 nex = 

73287 73405 

73958 74660 

74754 74910 

74995 75281 

735237 

254 nex = 

75025 75278 

716548 

1845 nex = 
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Term 
Intr 
Intr 
Intr 
Init 

>4757395 

len = 

Term 
Init 

>4757395 

len = 

Init 
Intr 
Intr 
Term 

>4757395 

len = 

Init 
Intr 
Term 

>4757396 

len = 

Sngl 

>4757396 

len = 

Sngl 

>4757399 

len = 

Sngl 

>4757400 

len = 

Term 
Init 



85494 84859 

85695 85618 

86164 86108 

86314 86260 

86703 86408 

/15932 

801 nex = 

42705 42137 
42937 42787 

/13267 

1911 nex = 

53364 54057 

54543 54725 

54857 55024 

55121 55274 

/118150 

1918 nex = 

55805 56142 
56228 56443 
57409 57722 

/4410 

471 nex = 

15964 16434 

/115489 

459 nex = 

15976 16434 

/115876 

341 nex = 

1954 2294 

/11763 

1275 nex = 

74376 74009 
75283 74843 



len = 

60 



2249 nex = 
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Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4757401 

len = 

Term
Intr 
Init 

>4757401 

len = 

Sngl 

>4757401 

len = 

Sngl 

>4757401 

len = 

Term 
Init 

>4757401 
len = 
Sngl 

>4757401 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4757403 

len = 



80687 
80926 
81202 
81363 
81868 
82022 
82194 
82408 



80160 
80866 
81027 
81306 
81764 
81965 
82108 
82279 



/924 

1438 nex = 

13076 12445 
13565 13169 
13882 13663 

/103058 

550 nex = 

22797 22249 

/115850 

614 nex = 

26499 25886 

/27707 

716 nex = 

27638 27345 
28060 27979 

/124616 

645 nex = 

32214 31570 

727482 

2410 



75050 
75597 
75850 
76313 
76564 
76891 



nex = 

75395 
75744 
76110 
76357 
76732 
77454 



/13058 
1510 nex 



Term 15873 15797 
Intr 16028 15960 
Intr 16200 16113








Intr 16374 16279 - 0
Intr 16632 16519 - 0
Intr 16870 16737 - 0
Init 17303 16971 - 0

>4757403 /38239
len = 2056 nex = 7
Term 15873 15307 - 0
Intr 16028 15960 - 0
Intr 16200 16113 - 0
Intr 16374 16279 - 0
Intr 16632 16519 - 0
Intr 16870 16737 - 0
Init 17362 16971 - 0

>4757403 /19537
len = 1552 nex = 4
Term 19050 18584 - 0
Intr 19389 19285 - 0
Intr 19620 19513 - 0
Init 20135 19978 - 0

>4757403 /24666
len = 1611 nex = 2
Init 24264 25305 + 0
Term 25397 25874 + 0

>4757403 /39358
len = 2877 nex = 8
Term 26486 25989 - 0
Intr 26777 26718 - 0
Intr 26988 26867 - 0
Intr 27249 27072 - 0
Intr 27624 27359 - 0
Intr 28030 27742 - 0
Intr 28252 28115 - 0
Init 28406 28339 - 0

>4757403 /39534
len = 554 nex = 1
Sngl 32923 32370 - 0

>4757403 /7337
len = 503 nex = 1
Sngl 35283 34781 - 0

>4757403 /40618






1967 nex =



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4757403 

len = 

Sngl 

>4757404 

len = 

Sngl 

>4757405 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4757405 

Sngl 
>4757405 

Sngl 

>4757405 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



51540 51679 

51794 51843 

52329 52391 

52696 52786 

52894 52998 

53300 53506 

738488 

610 nex = 

77818 78427 

/103083 

143 nex = 

32868 33010 

/36106 

2269 nex =

8267 7900 

8590 8359 

8829 8685 

9156 8911 

9449 9249 

9730 9637 

10168 9818 

/20648 

518 nex = 

39529 39012 

/94317 

397 nex =

41006 40610 

/100878 

1572 nex = 

45130 44797 

45395 45330 

45558 45474 

45726 45633 

45884 45827 

46200 46160 

46368 46277 



/5385 



len = 599 nex = 1
Sngl 47192 46605 - 0

>4757405 /40773
len = 598 nex = 1
Sngl 47192 46606 - 0

>4757405 /30812
len = 1544 nex = 6
Term 52068 51817 - 0
Intr 52294 52205 - 0
Intr 52456 52409 - 0
Intr 52724 52555 - 0
Intr 53078 53000 - 0
Init 53360 53196 - 0

>4757405 /16619
len = 761 nex = 2
Init 56271 56389 + 0
Term 56759 57031 + 0

>4757405 /3049
len = 2274 nex = 5
Init 56278 57300 + 0
Intr 57384 57484 + 0
Intr 57586 57669 + 0
Intr 57761 57871 + 0
Term 58094 58551 + 0

>4757406 /19244
len = 1177 nex = 2
Init 25061 25439 + 0
Term 25638 26237 + 0

>4757406 /35710
len = 1589 nex = 8
Init 31143 31329 + 0
Intr 31449 31591 + 0
Intr 31679 31786 + 0
Intr 31888 31932 + 0
Intr 32002 32067 + 0
Intr 32152 32212 + 0
Intr 32296 32342 + 0
Term 32424 32731 + 0

>4757406 /97340
len = 2530 nex = 5








Init
Intr
Intr
Intr
Term

>4757407
len =

Init
Term

>4757407
len =
Term
Intr
Intr
Intr
Init

>4757407
len =
Sngl

>4757407
len =

Term
Intr
Intr
Intr
Init

>4757407
len =

Init
Intr
Term

>4757407
len =
Sngl

>4757407
len =



33489 33681 

34354 34432 

34531 34598 

34773 34839 

35246 35497 

/145523 

522 nex = 

14432 14663 
14777 14953 

/16944 

1228 nex = 

35221 34886 

35427 35369 

35569 35507 

35739 35676 

36113 35815 

/18624 

562 nex = 

42900 43461 

/19127 

2025 nex =

44665 44410 

44838 44752 

45182 44922 

45743 45338 

46434 45860 

/41161 

1106 nex = 

5273 5534 
5627 5929 
6019 6378 

/11077 
466 nex = 
6030 6495 

/111727 
3141 nex = 



Init 75926 76044 
Intr 76156 76326






Intr
Intr 
Term 

>4757407 

len = 

Term 
Intr 
Intr 
Init 

>4757407 

len =

Term 
Intr 
Intr 
Intr 
Init 

>4757409 

len = 

Init 
Intr 
Term 

>4757409 

len = 

Term 
Init 

>4757410 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4757410 
len = 
Sngl 

>4757410 
len = 



76407 76463 
76541 76605 
76716 76840 



1893 nex =



77674 
77970 
78287 
79206 



77314 
77850 
78047 
78928 



/26596 
1953 nex 



82383 
82758 
83472 
83776 
84049 



82097 
82519 
83378 
83689 
84002 



/6166 

2028 nex = 

2893 3040 
3127 3389 
3462 4256 

/17962 

790 nex =

4534 4275 
5064 4726 

/13767 

2170 nex = 



11323 
11799 
12035 
12453 
12656 



11581 
11940 
12088 
12544 
12882 



/104278
970 nex = 
14335 13368 
/5180 
626 nex = 



Init 37202 37397 
Term 37493 37827
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>4757410 
len = 



/92670 
1849 nex 



Term 
Init 



42000 41613 
43461 43210 



>4757410 
len = 
Sngl 

>4757410 
len = 



729464 
895 nex =
50057 50951 
73819 
2093 nex = 



Init 
Term 



59901 60514 
60601 60838 



>4757410 
len = 



734479 
1298 nex 



Init 
Intr 
Term 



59369 59822 
59901 60514 
60601 60666 



>4757410 
len = 



723740 
1303 nex =



Init 
Intr 
Term 

>4757410 



59614 59822 
59901 60514 
60601 60916 

736248 

616 nex = 



Init 
Term 



60322 60514 
60601 60918 



>4757410 
len = 



7101693 
2146 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4757410 

len = 



67945 
68845 
69127 
69364 
69538 
69773 



68152 
68971 
69273 
69444 
69683 
70080 



7123200 
284 nex 



Sngl 73051 72768



0 






>4757411 
len = 
Sngl 

>4757411 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4757411 

len = 

Term 
Intr 
Init 

>4757413 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4757413 

len = 

Init 
Term 

>4757413 

len = 

Init 
Term 

>4757413 
len = 
Sngl 

>4757414 
len = 



/99062 

348 nex = 

14302 14649 

738286 

2413 nex = 

35325 35620 

36178 36431 

36689 36824 

37137 37198 

37290 37737 

/143077 

774 nex =

48735 48361 
48922 48839 
49134 48982 

/39613 

1930 nex = 

10486 10044 

10673 10581 

10937 10794 

11196 11110 

11493 11268 

11973 11691 

/37104 

1352 nex = 

37253 37571 
37707 38604 

/37513 

1375 nex = 

37253 37571 
37707 38627 

/9683 

1290 nex = 

40182 39244 

794738 

731 nex = 






Sngl 

>4757414 

len = 

Init 
Term 

>4757414 
len = 
Sngl 

>4757414 

len = 

Init 
Term 

>4757414 

len = 

Init 
Intr 
Term 

>4757414 

len = 

Sngl 

>4757414 

Init 
Intr 
Intr 
Intr 
Term 

>4757414 
len = 
Sngl 

>4757414 

len = 

Init 
Intr 
Intr 
Intr 



15443 14713 

/1185 

1812 nex = 

21364 21435 
21537 22038 

/113827 
443 nex = 
35589 36031 

/117748 

615 nex = 

41281 41606 
41845 41895 

/6463 

1403 nex = 

41281 41606 
41845 41895 
42583 42683 

723733 

1161 nex = 

43908 42748 

732274 

1690 nex = 

44451 44575 

44666 44762 

44860 44969 

45357 45431 

45541 45739 

717361 

167 nex = 

67827 67661 

721721 

4353 nex = 

79949 80134 

80262 80364 

80481 80545 

80685 80760 






Intr 81080 81144 + 0
Intr 81467 81590 + 0
Intr 81701 81826 + 0
Intr 81943 81993 + 0
Intr 82107 82175 + 0
Intr 82394 82502 + 0
Intr 82646 82725 + 0
Intr 82828 82910 + 0
Intr 82992 83037 + 0
Intr 83119 83293 + 0
Term 83381 83643 + 0

>4757415 /43010
len = 2179 nex = 5
Init 110 480 + 0
Intr 969 1054 + 0
Intr 1131 1436 + 0
Intr 1538 1621 + 0
Term 1706 2288 + 0

>4757415 /33108
len = 1990 nex = 5
Term 18813 18525 - 0
Intr 19074 18908 - 0
Intr 19639 19580 - 0
Intr 19870 19724 - 0
Init 20507 20200 - 0

>4757415 /35741
len = 2319 nex = 7
Term 21262 20642 - 0
Intr 21516 21346 - 0
Intr 21769 21603 - 0
Intr 21961 21868 - 0
Intr 22248 22040 - 0
Intr 22485 22334 - 0
Init 22960 22702 - 0

>4757415 /94723
len = 910 nex = 3
Term 8317 8042 - 0
Intr 8742 8423 - 0
Init 8946 8856 - 0

>4757417 /12993
len = 2378 nex = 8
Init 57794 57865 + 0
Intr 57948 58085 + 0
Intr 58174 58272 + 0
Intr 58373 58445 + 0
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Intr 
Intr 
Intr 
Term 



58911 58999 

59128 59211 

59310 59369 

59441 59775 



>4757660 



/40272 



2254 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



57533 
57794 
57948 
58174 
58373 
59128 
59310 
59441 



57711 
57865 
58085 
58272 
58445 
59211 
59369 
59786 



2683 



nex =



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



17650 
18075 
18244 
18448 
18662 
18812 
18993 
19524 



17772 
18155 
18332 
18522 
18726 
18883 
19430 
20051 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



23805 
24164 
24293 
24566 
24733 
25086 



24076 
24212 
24419 
24636 
24805 
25372 



>4757660 
len = 
Sngl 

>4757660 
len = 



3045 3801 

/6179 
1773 nex = 



Term 
Intr 
Intr 
Intr 
Init 



38593 
39063 
39389 
39630 
40059 



38287 
38802 
39159 
39497 
39789 



>4757660 

60 



/31508 






Term 41278 40970 
Init 41687 41426 



1534 



nex =



Term 
Intr
Intr 
Intr 
Init 



41278 
41687 
42011 
42219 
42584 



41051 
41426 
41778 
42089 
42322 



Term 
Init 



45205 44865 
45691 45430 



>4757660



len = 



1838 nex =



Term 
Intr 
Intr 
Intr 
Init 



45205 44861 

45691 45430 

46019 45783 

46265 46135 

46698 46390 



>4757660 
len = 



/30015 



Term 
Intr 
Intr 
Intr 
Init 



47388 47089 

47741 47486 

48105 47875 

48353 48223 

48967 48677 



>4757660 
len = 



1990 



Term 
Intr 
Intr 
Intr 
Init 



47388 47089 

47741 47486 

48105 47875 

48353 48223 

49075 48677 



>4757660 
len = 



734896 
1999 nex =



Term 
Intr 
Intr 
Intr 



54992 54648 

55353 55098 

55706 55473 

55926 55796 






Init 

>4757660

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4757660 

len = 
Sngl
>4757660 

len = 
Sngl 
>4757661 

len = 
>4757661

len = 

Init 
Intr 
Intr 
Term 

>4757661 

len = 

Term 
Intr 
Init 

>4757661 

len = 

Init 
Intr 
Intr 
Intr 
Term 



56646 56224 
/13762 
1939 nex = 



69818 
69980 
70123 
70269 
70491 
70666 
71316 



69378 
69915 
70059 
70206 
70382 
70579 
71035 



72663 

589 nex = 

85423 84835 

/143795 

550 nex = 

9114 8573 

733846 

998 nex =

718244 

1550 nex = 

20501 20937 

21063 21205 

21578 21697 

21792 22050 

712689 

1820 nex =

52653 52143 
53031 52754 
53962 53118 

737307 

1947 nex = 



79034 
79685 
79984 
80266 
80634 



79330 
79897 
80132 
80542 
80980 



len =



1550 nex = 



3 






Init 
Intr 
Term 

>4757661

len = 

Term 
Init 

>4757662 

len = 

Sngl 

>4757662 

len = 

Sngl 

>4757662 

len = 

Term 
Init 

>4757662 
len = 
Sngl 

>4757662 

len = 

Term 
Init 

>4757662 
len = 
Sngl 

>4757662 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 



88004 88300 
88888 89103 
89479 89553 

729823 

1270 nex = 

95428 95219 
96487 96281 

/151497 

374 nex = 

105041 105406 

733355 

764 nex =

105041 105804 

710525 

959 nex = 

107176 106447 
107405 107251 

740979 

2536 nex = 

11058 12235 

73858 

883 nex = 

127589 127180 
128062 127787 

725577 

641 nex = 

15911 16551 

713756 

2333 nex = 

26705 26948 

27713 27814 

27911 28038 

28342 28470 

28565 28632 

28747 29037 






10 



Term 
Intr 
Intr 
Init 

>4757662 



1931 



nex 



35291 34651 

35971 35763 

36176 36076 

36581 36264 

/206217 



len = 
15 Sngl 
>4757662 



51723 51806 
/41103 



20 



len = 

Term 
Intr 
Intr 
Intr 
Init 



2145 



nex 



55687 54683 

55901 55794 

56205 56003 

56423 56280 

56827 56682 



/10861 



1789 nex 



Term 
Intr 
Intr 
Intr 

35 Init 
>4757662 
len = 

40 

Term 
Intr 
Intr 
Intr 

4 5 Intr 
Intr 
Intr 
Intr 
Intr 

50 Intr 
Intr 
Intr 
Intr 
Intr 

55 Intr 
Init 



73379 73086 

73578 73468 

73781 73664 

73905 73869 

74874 74703 

/12880 

4180 nex = 



75805 
76039 
76358 
76502 
76821 
77212 
77417 
77706 
77866 
78048 
78257 
78415 
78718 
78917 
79200 
79624 



75445 
75900 
76118 
76452 
76596 
77142 
77298 
77512 
77802 
77958 
78196 
78365 
78617 
78810 
79137 
79320 



>4757662 



60 len = 



587 nex = 



1 






Sngl 

>4757678 

len = 

Sngl 

>4757678 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4757678 

len = 

Sngl 

>4757678 

len = 

Sngl 

>4757686 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4757688 

len = 

Term 
Intr 
Intr 
Init 

>4760247 

len = 



97501 97352 
/17521 
808 nex = 
1711 904 

/1241 
1888 nex = 



53235 
53822 
54052 
54206 
54450 
54631 
54792 
55025 



53440 
53963 
54113 
54317 
54543 
54700 
54930 
55122 



/6028 
675 nex = 
62812 62712 
796337 
910 nex = 
85389 85713 
/24090 
1474 nex = 



49322 
49490 
49666 
49814 
50155 
50294 



49010 
49428 
49586 
49770 
50066 
50238 



/117787 

1051 nex = 

32809 32597 

33059 32975 

33518 33485 

33647 33604 

/123742 

929 nex =



Term 16228 15990 
Intr 16403 16313
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Intr 
Intr 
Init



16601 16514 
16705 16680 
16918 16792 



>4760247
len = 



2251 



nex =



Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4760247 

len = 

Sngl 

>4760247 



1746 2458 

2735 2837 

2918 3021 

3285 3376 

3462 3559 

3647 3996 

/102245 

465 nex =

1766 2230 

/16412 



len 



2700 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



16228 
16403 
16601 
17018 
17212 
17537 
17721 
17870 
18036 
18186 
18460 
18689 



15990 
16313 
16514 
16792 
17097 
17282 
17634 
17799 
17928 
18128 
18397 
18561 



Term 
Intr 
Intr 
Intr 
Init 



2304 nex = 

6330 6048 

7503 7393 

7682 7577 

7899 7771 

8351 7975 

733299 



2250 



nex =



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



21949 
22229 
22409 
22653 
22805 
23090 
23430 
23651 



21710 
22137 
22356 
22513 
22740 
22889 
23376 
23590 
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Init 

>4760411 

len = 

Init 
Intr 
Intr 
Term 

>4760411 

len = 

Sngl 

>4760411 

len = 

Sngl 

>4760411 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4760411 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4760411

len = 

Sngl 

>4761801 

len = 

Sngl 



23959 23734 

/16491 

1975 nex = 

4186 4655 

4887 5035 

5129 5239 

5334 6160 

/120348 

653 nex = 

45700 45048 

/31883 

152 nex = 

6057 6208 

736997 

2050 



67189 
67456 
67716 
67946 
68253 
68522 
68705 
69013 



nex = 

66970 
67326 
67588 
67800 
68062 
68412 
68610 
68843 



/30022 
3315 



73699 
74021 
74270 
74416 
76009 
76608 



nex = 

73294 
73925 
74140 
74358 
75544 
76282 



/38193 
1474 nex =
77039 78512 
/93707 
619 nex = 
143737 143119 



>4761801



/115966 
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len 



649 



nex =



Sngl 151005 151653 

>4761801 /10618 

len = 1736 nex = 

Init 168932 169123 

Intr 169266 169370 

Intr 169478 169788 

Intr 169866 170051 

Term 170355 170667 



>4761801 
len = 



/30044 



2558 



nex =



Init 
Intr 
Intr 
Intr 
Term 



26588 27075 

27771 27904 

28035 28462 

28545 28676 

28863 29145 



>4761801 
len = 
Sngl 

>4761801 
len = 



/103581 



1859 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Term 



46090 
46495 
46666 
46946 
47328 
47528 



46263 
46564 
46725 
47060 
47443 
47948 



>4761801 
len = 



/27650 
643 nex = 



Init 
Term 



47359 47443 
47528 47991 



>4761801 
len = 



4450 



Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 



83154 
83294 
83538 
83672 



84594 
85379 



82922 
83234 
83451 
83607 
83988 
84487 
85289 
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Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4775266 

Init 
Term 

>4775266 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4775266 
len = 
Sngl 

>4775266 

len = 

Term 
Intr 
Intr 
Init 

>4775266 

len = 

Term 
Init 

>4775266 

len = 

Sngl 

>4775266 



85690 
85964 
86148 
86290 
86558 
87054 
87364 



85604 
85841 
86057 
86229 
86461 
86904 
87194 



17383 17754 
18006 18793 



/20805 
2816 nex 



19777 
20186 
20536 
20819 
21162 
21579 
21927 
22133 
22327 



19920 
20444 
20742 
21059 
21486 
21842 
22048 
22252 
22592 



/20161 

557 nex = 

25109 24553 

/33023 

1956 nex = 

27412 26849 

27825 27714 

28452 28152 

28804 28549 

/10077 

715 nex = 

38914 38632 
39346 39105 

/18040 

801 nex = 

44171 44971 

/7221 



len =



1630 nex = 



6 
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Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4803878 
len = 
Sngl 

>4803878 

len = 

Term 
Intr 
Init 

>4803878 

len = 

Term 
Intr 
Init 

>4803878 

len = 

Term 
Intr 
Intr 
Init 

>4803878 
len = 
Sngl 

>4803878 

len = 

Term 
Intr 
Intr 
Init 

>4803878 

len = 



4598 4139 

4777 4686 

4950 4865 

5215 5035 

5520 5300 

5766 5592 

74897 

574 nex =

22484 21911 

739544 

1063 nex = 

38379 37933 
38854 38542 
38995 38931 

737896 

1436 nex = 

38379 37680 
38854 38542 
39115 38931 

714132 

1582 nex = 

38379 37679 

38854 38542 

39082 38931 

39260 39188 

74936 

537 nex = 

45234 44698 

72725 

1431 nex = 

51078 50648 

51328 51226 

51702 51580 

52078 51914 

710780 

1615 nex = 



Init 52264 52570 
Intr 52981 53054
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Term 



53516 



53878 



10 



>4803878 /40875 

len = 817 nex =

Sngl 65721 66537 

>4803878 /117183 

len = 1607 nex = 



Init 73114 73395 

Intr 73476 73685 

Intr 74219 74295 

Term 74422 74720 



>4803878 

len = 

Term 
Intr 
Intr 
Init 



/19202 

1552 nex = 

81492 81413 

81857 81669 

82005 81947 

82964 82666 

729968 



1375 nex 



Term 
Intr 
Intr 
Intr 
Init 



83510 
83688 
83874 
84042 
84361 



83314 
83599 
83772 
83962 
84132 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



2506 

88965 
89749 
89927 
90054 
90200 
90345 
90639 
90777 
90945 
91150 



nex = 

89156 
89846 
89969 
90115 
90256 
90380 
90711 
90848 
91048 
91470 



>4803878 



/30884 
1120 nex =



Term 
Intr 
Intr 
Intr 
Init 



91871 91785 

92048 91974 

92193 92137 

92413 92291 

92898 92778 
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Term 


91871 


91565 




0 


Intr 92048 91974 - 0
Intr 92193 92137 - 0
Intr 92413 92291 - 0
Init 92980 92778 - 0



len = 



1433 



Term 
Intr 
Intr 
Intr 
Init 



91871 
92048 
92193 
92413 
93029 



91597 
91974 
92137 
92291 
92778 



>4803909 



len = 

25 

Init 
Intr 
Intr 
Intr 

3 0 Intr 
Intr 
Term 



2548 nex = 

25320 25450 

25545 25674 

25969 26075 

26572 26706 

26804 26907

27030 27130 

27226 27508 



>4803909 



/101918 



2004 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



25331 25450 

25545 25674 

25969 26075 

26572 26706 

26804 26907 

27030 27130 

27226 27334 



>4803909 
len = 



735385 
1648 nex =



Init
Term 



34510 35420 
35503 36157 



>4803909 
len =



1999 nex



Term 
Intr 
Intr 

Intr



74683 74273 

75005 74768 

75269 75203 

75560 75398 
Init

>4803909 

len = 

Sngl 

>4803919 

len = 

Sngl 

>4803919 

len = 

Term 
Intr 
Init 

>4803919 
len = 
Sngl 

>4803919 

len = 

Term 
Init 

>4803919 

len = 

Sngl 

>4803919 

len = 

Sngl 

>4803919 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



76271 75939 

/14382 

572 nex = 

98753 98182 

/7141 

470 nex = 

10681 11150 

/20090 

638 nex =

11473 11254 
11717 11664 
11891 11797 

/19643 

515 nex = 

18105 18619 

79942 

628 nex = 

18385 18182 
18466 18413 

/42149 

503 nex = 

18515 18013 

737778 

570 nex =

19660 20229 

/39351 

2416 nex = 

21856 22144 

22554 22580 

22672 22741 

22863 22986 

23146 23230 

23321 23453 

23544 23717 

23785 23867 

23953 24271 
2062 nex =



Term 
Intr
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



24902 
25064 
25345 
25530 
25707 
25854 
26112 
26720



24659 
24993 
25142 
25453 
25619 
25804 
25945 
26568 



>4803919
len = 



2750 






Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4803919 



Sngl
>4803919 






len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



27185 
27343 
27502 
27642 
27860 
28170 
28344 
28920 
29072 
29223 
29717 



26968 
27269 
27437 
27589 
27729 
28094 
28266 
28805 
29008 
29158 
29489 



2799 

40435 
41203 
41359 
41521 
41788 
42010 
42474 
42653 
42877 



nex = 

40910 
41259 
41437 
41681 
41915 
42147 
42566 
42773 
43233 



/7133 
1106 nex = 



Term 
Init 



15958 
16398 



15293 
16136 



len = 1810 nex = 6
Init
Intr 
Intr 
Intr 
Intr 
Term 

>4809270 
len = 
Sngl 

>4809270 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4809270 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4809271 

len = 

Term 
Init 

>4809271 

len = 

Sngl 

>4809271 

len = 

Sngl 



26325 
26751 
26913 
27129 
27488 
27698 



26534 
26841 
27026 
27407 
27601 
27770 



723558 
674 nex =
40607 41280 
736972 
2847 nex = 



64059 
64238 
64402 
64578 
64827 
65192 
65390 
65694 
65818 
66016 
66195 



64119 
64297 
64477 
64743 
65082 
65307 
65610 
65728 
65917 
66118 
66599 



/20668 

1075 nex =

65522 65610 

65694 65728 

65818 65917 

66016 66118 

66195 66596 

/21905 

1099 nex = 

12979 12614 
13712 13449 

722359 

1095 nex =

13712 12618 

7848 

760 nex = 

33077 32318 



>4809271



7102190 
len =
Sngl 
>4809271 
len = 
Sngl 
>4809271 
len = 
Sngl 
>4809294 
len = 



743 nex = 
33077 32335 
/21181 
344 nex =
8636 8293 

/6687 
1406 nex = 
8731 7326 

/8411 
1440 nex = 



Init 
Intr 
Intr 
Term 



14573 14846 

15067 15169 

15295 15456 

15734 16012 



>4809294 
len = 



78986 
2265 ne 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



21367 
21594 
21897 
22122 
22349 
22682 
23003 



21498 
21784 
22034 
22258 
22514 
22921 
23631 



>4809294 
len = 



/33512 
1150 nex 



Term 

Init 



30857 30449 
31590 31115 



>4809295 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4809296 

len = 



73676 
2083 nex 



53797 
53990 
54177 
54877 
55272 
55753 



53671 
53913 
54096 
54684 
55181 
55348 



730187 
1390 nex 
Term 20449 20131 - 0
Intr 20937 20890 - 0
Init 21511 21158 - 0

>4809296 /3071
len = 1092 nex = 4
Init 411 493 + 0
Intr 704 768 + 0
Intr 971 1041 + 0
Term 1133 1359 + 0

>4809296 /34078
len = 1150 nex = 4
Init 411 493 + 0
Intr 704 768 + 0
Intr 971 1041 + 0
Term 1133 1416 + 0

>4835223 /36702
len = 2050 nex = 8
Term 28902 28573 - 0
Intr 29030 28977 - 0
Intr 29207 29113 - 0
Intr 29404 29307 - 0
Intr 29625 29483 - 0
Intr 29839 29708 - 0
Intr 30116 29988 - 0
Init 30618 30194 - 0

>4835223 /3286
len = 562 nex = 1
Sngl 40069 39514 - 0

>4835223 /B864
len = 1585 nex = 4
Term 42999 42518 - 0
Intr 43254 43184 - 0
Intr 43459 43352 - 0
Init 44102 43814 - 0

>4835223 /153949
len = 1900 nex = 6
Init 50935 51051 + 0
Intr 51154 51196 + 0
Intr 51320 51381 + 0
Intr 51479 51575 + 0
Intr 52119 52189 + 0
Term 52348 52439 + 0



Reference No. 2750-942P 



>4835223 

len = 

Init 
Term 

>4835223 

len = 

Sngl 

>4835223 

>4835223 

len = 

Sngl 

>4835223 

len = 

Sngl 

>4835223 

len = 

Sngl 

>4835223 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4835773 

len = 

Sngl 

>4835773 

len = 

Init 



/27213 

1570 nex = 

59627 60267 
60638 61187 

7732 

1474 nex = 

62121 62953 

/20695 

1163 nex = 

/23377 

790 nex = 

65775 64987 

/10044 

495 nex = 

66699 67193 

/32364 

1750 nex = 

78553 80295 

/5193 

1348 nex = 

87820 87718 

87968 87907 

88132 88060 

88361 88219 

88494 88447 

88643 88569 

88863 88780 

89065 88950 

/1989 

690 nex = 

20133 20822 

/34385 

2830 nex = 

2024 2254 
Intr
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



2371 
2617 
2833 
3023 
3363 
3582 
3767 
3929 
4116 
4369 
4602 



2450 
2733 
2936 
3089 
3496 
3630 
3838 
4030 
4183 
4521 
4852 



1946 nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4835773 

len = 

Term 
Intr 
Init 

>4835773 

len = 

Init 
Intr 
Intr 
Term 

>4836442 
len = 
Sngl 

>4836906 

len = 

Term 
Intr 
Intr 
Init 



42427 
42604 
42803 
42991 
43278 
43797 
43997 



42503 
42707 
42903 
43094 
43715 
43928 
44372 



/28214 

1757 nex = 

48583 47836 
48864 48677 
49592 48965 

/21943 

1870 nex = 

5082 5505 

5583 5729 

5834 6043 

6473 6945 

/1137 
730 nex = 
5764 6489 

/93312 
1831 nex = 



26269 
26453 
26664 
27844 



26014 
26358 
26531 
26730 



len = 831 nex = 1
Sngl

>4836906 

len = 

Term 
Init 



64559 63729 

/14139 

1570 nex =

80467 80066 
80887 80566 



>4836906 

len = 

Init 
Term 

>4850281 
len = 
Sngl 

>4850409 

len = 

Term 
Init 

>4850409 

len = 



86401 86906 
87082 87298 

/4143 
436 nex = 
107876 108311 

/12642 

583 nex = 

104024 103684 
104266 104150 

/20585 

730 nex = 



Init 
Term 

>4850409 
len = 
Sngl 

>4850409 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4850409 

len = 

Term 
Intr 
Intr 
Intr 



106425 106547 
106829 107153 

/28536 

296 nex = 

106829 107124 

/39306 

2314 nex = 



14095 
14334 
14590 
14895 
15124 



14194 
14483 
14863 
14982 
15585 



/3062 

2530 nex = 

18950 18748 

19135 19019 

19377 19225 

19637 19467 
Intr
Intr
Init

>4850409 
len = 

>4850409 
len = 
Sngl 

>4850409 

len = 

Init 
Intr 
Intr 
Term 

>4850409 
len = 
Sngl 

>4850409 

len = 

Term 
Intr 
Intr 
Init 

>4850409 

len = 

Term 
Init 

>4850409 

len = 

Term 
Init 

>4850409 

len = 

Sngl 



19900 19721 
20082 19984 
20494 20407 

/4791 

736 nex = 

727686 

716 nex = 

23412 24127 

/30722 

1677 nex =

34646 34737 

34898 35330 

35419 35472 

35528 35913 

794765 

235 nex = 

35586 35820 

740551 

1848 nex = 

62292 61937 

62986 62451 

63104 63016 

63784 63510 

7112273 

850 nex = 

66692 66564 
66880 66784 

75724 

1295 nex = 

66692 66532 
66880 66784 

72548 

692 nex =

67705 68396 



>4850409 78711






len = 

Sngl 

>4850409 

len = 

Sngl 

>4850409 

len = 

Sngl 

>4850409 

len = 

Sngl 

>4850409 

len = 

Sngl 

>4850409 

len = 

Sngl 

>4850411 

len = 

Sngl 

>4874280 

len = 

Init 
Intr 
Intr 
Term 

>4874280 

len = 

Sngl 

>4874280 

len = 

Term 



708 nex = 

67705 68412 

/2535 

598 nex = 

69019 69616 

/30316 

790 nex =

76323 75925 

/7539 

791 nex = 

76452 75925 

74289 

1643 nex = 

90019 91371 

727754 

1431 nex = 

89946 91376 

721257 

430 nex =

14482 14057 

712402 

1492 nex = 

100913 101010 

101624 101737 

101836 101915 

102002 102400 

713543 

670 nex = 

110726 110057 

729810 

1717 nex = 

115803 115457 
Intr 116038 115976 - 0
Intr 116268 116158 - 0
Intr 116431 116349 - 0
Intr 116759 116512 - 0
Intr 116980 116882 - 0
Init 117173 117074 - 0



>4874280 

len = 

Sngl 

>4874280 

len = 

Sngl 

>4874280 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4874280 
len = 
Sngl 

>4874280 

len = 

Term 
Intr 
Intr 
Init 

>4874280 
len =
Sngl 

>4874280 

len = 

Term 
Intr 
Intr 
Init 



/14469 
146 nex = 
25949 25804 
/1871 
1210 nex = 
37446 36737 
/25201 
1044 



43874 
44112 
44365 
44512 
44684 
44864 



nex = 

43821 
43996 
44198 
44447 
44589 
44764 



/10068 
740 nex = 
57061 56322 
/11912 
772 nex = 



62694 
62885 
63127 
63285 



62514 
62784 
62992 
63202 



/30239 

758 nex = 

63635 64392 

/22166 

1400 nex = 

64736 64526 

65314 65273 

65723 65668 

65925 65828 
>4874280
len =
Sngl

>4874280
len =

Init
Term

Term
Intr
Init

>4874280
len =

Term
Init

len =

Term
Intr
Intr
Intr
Init

>4874280
len =
Sngl

>4874280
len =
Sngl

>4874281
len =
Sngl

>4878038
len =



78886 

271 nex = 

83252 82982 

/21847 

1253 nex = 

88169 88958 
89032 89421 

/19135 

5291 nex = 

86468 86081 
87158 86630 
91371 91235 

/16209 

1870 nex = 

90809 89582 
91445 91235 

79177 

1603 nex = 

92357 92087 

92498 92429 

92677 92587 

93017 92768 

93689 93526 

717356 

1051 nex = 

97810 98457 

7121746 

752 nex = 

97770 98521 

718634 

992 nex = 

28766 29757 

733093 

1838 nex = 






Init 
Intr 
Intr 
Term 

>4878038 

len = 

Init 
Intr 
Intr 
Term 

>4878038 
len = 
Sngl 

>4878038 

len = 

Term 
Intr 
Init 

>4878038

len = 

Term 
Intr 
Init 

>4878038 

len = 

Term 
Intr 
Init 

>4878038 

len = 

Term 
Intr 
Init 

>4878038 

len = 



17045 17083 

17496 17715 

17935 18104 

18283 18716 

/12279 

1657 nex = 

17042 17083 

17496 17715 

17935 18104 

18283 18698 

/41419 

101 nex = 

1922 1822 

/103167 

827 nex = 

22122 21766 
22412 22340 
22592 22509 

/12044 

832 nex =

22122 21767 
22412 22340 
22598 22509 

/7414 

850 nex =

22122 21767 
22412 22340 
22611 22509 

739977 

915 nex = 

22122 21728 
22412 22340 
22642 22509 

/18748 

635 nex = 






Term 
Intr 
Init 



22122 22008 
22412 22340 
22642 22509 
>4878038

len = 

Term 
Intr 
Init 

>4878038 

len = 

Init 
Intr 
Term 

>4878038 
len = 
Sngl 

>4878038 

len = 

Term 
Intr 
Intr 
Init 

>4878038 

len = 

Init 
Intr 
Term 

>4878038 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4878038 

len = 

Init 
Intr 
Term 



734623 

911 nex = 

22122 21766 
22412 22340 
22676 22509 

735553 

1034 nex = 

23065 23187 
23268 23447 
23838 24098 

740238 

1210 nex = 

2380 1177 

7107233 

643 nex = 

48867 48732 

49015 48983 

49222 49136 

49374 49316 

735098 

2170 nex =

53312 53618 
53769 54355 
55015 55476 

76413 

3951 nex =



56661 
58951 
59203 
59403 
59530 



56949 
59082 
59289 
59453 
60611 



7119807 

1461 nex = 

59122 59289 
59403 59453 
60354 60582 



len = 1012 nex = 3
Init
Intr
Term

>4878038
len =

Init
Term

>4878038
len =

Sngl

>4878039
len =

Init
Term

>4878039
len =

Init
Term

>4878039
len =

Sngl

>4878039
len =
Sngl

>4878039
len =
>4878039
len =
Sngl

>4878039
len =
Init

Intr



65143 65373 
65735 65837 
66104 66154 

/123625 

897 nex = 

69908 70155 
70242 70804 

/150460 

313 nex = 

75227 74915 

/27589 

1529 nex = 

21799 21871 
22351 22674 

/31701 

1491 nex = 

21799 21871 
22351 22658 

722285 

1224 nex = 

23812 23609 

/7311 

1244 nex = 

23835 23609 

/13410 

1243 nex = 

/11384 

534 nex = 

41117 40584 

/21735 

2391 nex = 

6076 6252 
6656 6778 




+ 0 
+ 0 
+ 0 

2 

+ 0 
+ 0 

1 

0 

2 

+ 0 
+ 0 

2 

+ 0 
+ 0 

1 

0 

1 

0 

0 

1 

0 

6 

+ 0 
+ 0 






Intr 
Intr 
Intr 
Term 

>4878039 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4878039 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4878039 

len = 

Sngl 

>4878039 

len = 

Sngl 

>4878039 

len = 

Sngl 

>4878039 

len = 

Term 
Intr 
Intr 
Init 

>4878039 

len = 



6934 7050 

7141 7336 

7584 7698 

7795 7959 

/153670 

2383 nex = 



6084 
6656 
6934 
7141 
7584 
7795 



6252 
6778 
7050 
7336 
7698 
7959 



/118900 
2308 nex 



6091 
6656 
6934 
7141 
7584 
7795 



6252 
6778 
7050 
7336 
7698 
7959 



/17026 

156 nex = 

61886 61731 

724663 

334 nex = 

61970 61637 

/15897 

458 nex = 

62053 61601 

79325 

1366 nex =

62075 61726 

62381 62331 

62704 62472 

63091 62815 

733019 

1400 nex = 



Term 62075 61700 
Intr 62381 62331
Intr
Init 

>4878039 
len = 
Sngl 

>4878039 
len = 

>4878039 

len = 

Term 
Intr 
Intr 
Init 

>4878039 

len = 

Init 
Intr 
Term 

>4878039 

len = 

Term 
Intr 
Init 

>4878039 

len = 

Term 
Init 

>4883587 
len = 
Sngl 

>4883587 

len = 

Init 
Intr 
Intr 
Term 



62704 62472 
63099 62815 

/14650 

190 nex = 

63116 62933 

/41423 

1419 nex = 

733922 

1522 nex = 

62075 61601 

62381 62331 

62704 62472 

63122 62815 

78579 

813 nex = 

72885 73192 
73271 73362 
73447 73688 

/38014 

1795 nex = 

81714 81447 
81951 81802 
82973 82025 

/6888 

1128 nex = 

84567 84273 
85400 84860 

/206485 

538 nex = 

113867 113330 

/8904 

1657 nex = 

1398 1648 

2403 2594 

2697 2865 

2973 3054 
>4883587

len = 

Sngl 

>4883587 

len = 

Sngl 

>4883588 

len = 

Init 
Intr 
Term 

>4883588

len = 

Init 
Intr 
Term 

>4883588 

len = 

Init 
Term 

>4883588 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4883589 
len = 
Sngl 

>4883595 
len = 



/99140 

612 nex = 

81836 81225 

/35520 

1890 nex = 

83089 81200 

/30158 

1058 nex = 

78085 78219 
78340 78388 
78702 79142 

/42159 

938 nex = 

78151 78219 
78340 78388 
78702 79088 

/117103 

956 nex = 

78187 78219 
78340 79142 

/12963 

2950 nex = 

79984 79379 

80178 80092 

80331 80272 

80484 80416 

80729 80573 

80924 80848 

81328 81283 

81508 81422 

81730 81584 

82321 81955 

/39471 

910 nex = 

27167 26259 

/27178 

1305 nex = 
Init
Intr 
Intr 
Intr 
Term 



13666 13838 

14082 14123 

14233 14403 

14555 14641 

14739 14970 



728643 



1274 



nex 



Init 
Intr 
Intr 
Intr 
Term 



13690 13838 

14082 14123 

14233 14403 

14555 14641 

14739 14963 



727627 



1278 



nex =



Init 
Intr 
Intr 
Intr 
Term 



13690 13838 

14082 14123 

14233 14403 

14555 14641 

14739 14967 



>4883595 



7121982 



1275 



nex = 



Init 
Intr 
Intr 
Intr 
Term 



13695 13838 

14082 14123 

14233 14403 

14555 14641 

14739 14969 



>4883595 
len = 



725350 



1004 



nex =



Init 
Intr 
Intr 
Intr 
Term 



13767 13838 

14082 14123 

14233 14403 

14555 14641 

14739 14764 



732137 



1896 nex



Init 
Intr 
Intr 
Intr 
Term 



38588 38819 

39455 39511 

39688 39878 

39967 40071 

40162 40483 



737242 



len = 1971 nex = 5
Init
Intr 
Intr 
Intr 
Term 

>4883595 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4883595 

len = 

Sngl 

>4883595 

len = 

Sngl 

>4883595 

Term 
Intr 
Intr 
Init 

>4883595 

len = 

Init 
Intr 

>4883595 
len = 
Sngl 
>4883598 
len = 
Sngl 
>4883598 
len = 



38588 38819 

39455 39511 

39688 39878 

39967 40071 

40162 40558 

/20760 

1990 nex = 



38780 
39455 
39688 
39967 
40162 



38819 
39511 
39878 
40071 
40576 



734869 

899 nex =

38590 38819 

/25104 

372 nex =

40223 40578 

/5109 

1376 nex = 

66464 66136 

66734 66627 

66906 66815 

67219 67018 

/113853 

912 nex = 

67833 68023 
68119 68397 
68518 68744 

/27387 

1095 nex = 

97707 96779 

/112395 

102 nex = 

22321 22422 

/27175 

1292 nex = 






Term
Intr
Intr
Intr
Intr
Init

>4883598
len =

Init
Intr
Term

>4883599
len =

Init
Intr
Intr
Intr
Intr
Term

>4883599
len =

Sngl

>4883599
len =
Sngl

>4883599
len =
Term
Intr
Intr
Init

>4883599
len =

Term
Intr
Intr
Init

>4883599
len =






53471 53305 - 0 

53639 53577 - 0 

53815 53735 - 0 

53963 53919 - 0 

54304 54215 - 0 

54443 54387 - 0 

/31249 

1066 nex = 3 

75625 75768 + 0 

75862 75941 + 0 

76089 76422 + 0 

/41933 

2470 nex = 6

107705 108086 + 0 

108321 108417 + 0 

108494 108609 + 0 

109315 109359 + 0 

109639 109774 + 0 

109859 110173 + 0 

/19094 

1768 nex = 1 

118037 116270 - 0 

/11488 

696 nex = 1 

119284 118589 - 0 

/97088 

1795 nex = 4 

11068 10803 - 0 

11765 11697 - 0 

12260 12150 - 0 

12597 12343 - 0 

/7831 

2010 nex = 4 

34414 33824 - 0 

35311 35094 - 0 

35494 35394 - 0 

35833 35596 - 0 



/13320 
2015 nex = 1 






Sngl 

>4883599 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4883599 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4883599 

len = 

Term 
Intr 
Intr 
Init 

>4883599 

len = 

Init 
Term 

>4883599 

len = 

Init 
Term 

>4883599 

len = 

Init 
Intr 
Term 

>4883599 

len = 



42074 40567 

/29157 

1707 nex = 

43450 43184 

43605 43533 

43874 43750 

44036 43965 

44322 44225 

44890 44558 

/16221 

1270 nex =

50525 50232 

50711 50610 

50909 50793 

51085 50985 

51498 51268 

/2416 

1352 nex = 

52955 52650 

53295 53025 

53806 53522 

54001 53894 

736633 

1133 nex = 

6605 7052 
7144 7737 

/119288 

646 nex =

6611 7052 
7144 7256 

/12558 

2034 nex = 

66457 66727 
66744 66880 
66958 68472 

/12259 

212 nex = 



Sngl



67175 67386 



0 
>4883599

len = 

Term 
Intr 
Intr 
Init 

>4883599 

len = 

Init 
Intr 
Term 

>4883599 

len = 

Term 
Init 

>4884020 

len = 

Term 
Intr 
Init 

>4884020 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4884020 

len = 

Init 
Intr 
Intr 
Term 

>4884021 

len = 

Init 



1599 



nex 



71283 71095 

71609 71547 

71898 71722 

72208 72146 

/19570 



76320 76853 
77103 77181 
77311 78033 



98968 98595 
99225 99034 



15439 14937 
15776 15562 
16497 16374 



2037 nex 



22090 
22315 
22607 
22782 
22906 
23079 
23444 
23627 



22010 
22187 
22491 
22718 
22867 
23017 
23183 
23517 



727539 

1335 nex = 

34674 35251 

35340 35450 

35592 35705 

35914 36008 

/2419 

1136 nex = 

33897 34058 

Term 34433 35032 + 0

>4884021 /19314
len = 2613 nex = 2
Init 3435 3738 + 0
Term 3868 6047 + 0

>4884022 72982
len = 2050 nex = 8
Init 25501 25640 + 0
Intr 25886 26005 + 0
Intr 26146 26292 + 0
Intr 26381 26476 + 0
Intr 26560 26655 + 0
Intr 26927 27044 + 0
Intr 27132 27204 + 0
Term 27301 27548 + 0

>4884022 /40445
len = 3512 nex = 12
Term 28180 27923 - 0
Intr 28424 28270 - 0
Intr 28550 28502 - 0
Intr 28909 28653 - 0
Intr 29159 29030 - 0
Intr 29337 29269 - 0
Intr 29632 29579 - 0
Intr 29897 29739 - 0
Intr 30278 30141 - 0
Intr 30434 30372 - 0
Intr 30577 30538 - 0
Init 30707 30661 - 0

>4884023 794438
len = 1247 nex = 3
Init 32053 32253 + 0
Intr 32590 32558 + 0
Term 32956 33299 + 0

>4884023 740485
len = 2060 nex = 7
Init 76282 76657 + 0
Intr 76812 76874 + 0
Intr 77011 77127 + 0
Intr 77202 77448 + 0
Intr 77547 77705 + 0
Intr 77780 77931 + 0
Term 78015 78341 + 0

>4884023



7110908 






>4886265
len =

Init
Intr
Intr
Term

>4886265
len =

Sngl

>4886265
len =
Sngl

>4886265
len =
Sngl

>4886265
len =

Term
Intr
Intr
Intr
Init

>4886265
len =

Term
Intr
Intr
Intr
Intr
Init

>4886265
len =

Init
Term



372 nex = 

/16125 

1665 nex = 

15167 15326 

15839 15978 

16219 16344 

16429 16831 

739625 

372 nex = 

45373 45744 

/11782 

910 nex = 

58085 58990 

/92178 

910 nex = 

60596 59692 

77969 

1105 nex = 

6613 6546 

6828 6699 

7010 6925 

7452 7407 

7643 7530 

79716 

1646 nex = 

6613 6377 

6828 6699 

7010 6925 

7452 7407 

7649 7530 

8022 7921 

737248 

1450 nex = 

84108 84785 
84958 85556 



>4886265 714246






len = 1696 nex =
Term 88532 88155
Intr 88912 88604
Intr 89088 88987
Init 89850 89191

>4887257 /145295
len = 1695 nex =
Term 10323 10083
Intr 10562 10518
Intr 10742 10642
Init 11039 10834



>4887257 

len = 

Term 
Intr 
Intr 
Init 



/22508 

1006 nex = 

23643 23309 

23922 23824 

24104 24023 

24314 24189 



/143886 



Init
Intr 
Term 



69630 69796 
69886 69951 
70176 70475 



>4887257 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 



2950 

74958 
75114 
75239 
75402 
75574 
75705 
75982 
76205 
76563 
76761 
77059 
77646 



nex = 

74703 
75047 
75176 
75321 
75516 
75646 
75820 
76081 
76501 
76702 
76880 
77488 



2801 



nex 



Init 
Intr 
Intr 
Intr 
Intr 
Intr 



78250 
78410 
78598 
78737 
78900 
79275 



78326 
78474 
78644 
78812 
78967 
79344 






Intr 
Intr 
Intr 
Term 

>4887257 

len = 

Init 
Intr 
Term 

>4887737 

len = 

Sngl 

>4887737 

len = 

Sngl 

>4887737 

len = 

Sngl 

>4887737 

len = 

Sngl 

>4887737 

len = 

Term 
Init 

>4887737

len = 

Term 
Init 

>4887738 
len = 
Sngl 

>4887738 
len = 



79628 79713 

79880 79942 

80237 80302 

80400 80609 

/109912 

1124 nex = 

94040 94098 
94128 94314 
94452 95163 

/2004 

675 nex = 

7907 8581 

/1505 

673 nex = 

7909 8581 

/31675 

599 nex = 

7913 8511 

/106951 

677 nex =

7913 8589 

/108174 

1630 nex = 

85528 84876 
86500 86083 

733557 

1788 nex = 

85528 84753 
86540 86083 

723872 

764 nex = 

20581 19818 

7119765 

3139 nex = 






Init 3396 4329 + 0
Intr 4533 4587 + 0
Intr 5001 5039 + 0
Intr 5122 5185 + 0
Intr 5549 5576 + 0
Intr 5741 5832 + 0
Term 6013 6534 + 0



Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 

>4887738 

len = 

Term 
Intr 
Init 

>4887738 

len = 

Init 
Intr 
Intr 
Term 

>4887738 

len = 



Term 
Init 

>4887738 
len = 
Sngl 

>4887740 

len = 

Init 
Intr 
Intr 
Intr 
Term 



3136 nex =

3399 4329 

4533 4587 

5001 5039 

5122 5185 

5549 5576 

5741 5832 

6013 6534 

/28177 

1570 nex = 

43839 43444 

44038 43918 

45006 44725 

/6612 

2489 nex = 

61698 61850 

61949 62099 

62198 62262 

62746 63244 

729992 

926 nex = 

78325 77774 

78699 78401 

/146828 

507 nex = 

83673 83167 

/33192 

2290 nex = 

199 824 

922 1089 

1185 1376 

1551 1619 

1985 2290 






>4887740
len =

Init
Term

>4887740
len =
Sngl

>4895147
len =
Init
Intr
Intr
Intr
Intr
Intr
Term

>4895147
len =

Term
Intr
Intr
Init

>4895147
len =
Sngl

>4895147
len =

Term
Intr
Init

>4895147
len =
Term
Intr
Init

>4895147
len =



723467 

521 nex = 

46831 46988 
47080 47203 

/12467 

651 nex = 

5943 5293 

/124537 

1961 nex = 

15249 15358 

15727 15850 

15938 15966 

16050 16115 

16202 16309 

16399 16464 

16795 16876 

/21739 

1469 nex = 

50700 50490 

51095 50898 

51286 51185 

51958 51514 

/21920 

444 nex =

53076 52638 

737658 

1662 nex = 

53123 52698 
53993 53216 
54359 54077 

724367 

993 nex = 

55478 55257 
55772 55560 
56104 56024 

720633 

1066 nex =






Term
Intr
Init

>4895147
len =

Term
Init

>4895147
len =

Term
Init

>4895147
len =
Term
Init

>4895147
len =

Sngl

>4895147
len =

Term
Intr
Init

>4895147
len =

Term
Intr
Init

>4895147
len =
Sngl

>4895147
len =
Term
Intr



55478 55258 
55772 55560 
56104 56024 

739533 

730 nex =

55772 55601 
56104 56024 

74458 

740 nex = 

55772 55598 
56104 56024 

/20676 

1041 nex = 

55478 55286 
56104 56024 

7524 

1101 nex = 

56104 56024 

/6803 

979 nex = 

55478 55354 
55772 55560 
56104 56024 

715992 

1018 nex = 

55478 55318 
55772 55560 
56104 56024 

/24607 

550 nex = 

68923 68382 

/43068 

2470 nex = 

70358 70271 
70597 70472 
Intr
Intr 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4895147 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4895147 

len = 

Init 
Intr 
Intr 
Term 

>4895147 

len = 

Sngl 

>4895147 

len = 

Sngl 

>4895148 

len = 

Term 
Init 

>4895148 
len = 
Sngl 

>4895148 
len = 



70793 
71005 
71301 
71483 
71698 
71890 
72038 
72731 



70704 
70880 
71248 
71403 
71591 
71819 
71976 
72465 



/20749 
2192 nex 



7453 
7862 
8068 
8583 
9110 
9343 



7722 
7924 
8184 
8988 
9261 
9644 



728975 

1253 nex = 

77491 77810 

78228 78304 

78428 78536 

78629 78743 

798397 

730 nex =

79391 79138 

737526 

1066 nex = 

80755 81820 

714553 

1704 nex = 



39478 
40602 



38899 
39746 



7114523 
310 nex = 
55690 55388 
740329 
1066 nex = 



Term 55806 55404 
Init 56469 56070
>4895148
len = 
Sngl 

>4895149 

len = 

Term 
Init 



/109156 

480 nex = 

57879 57400 

/37283 

1572 nex = 

20933 20082 
21653 21105 



>4895167 
len = 



Init 
Intr 
Term 



43380 43470 
43555 43730 
43825 44256 



>4895176 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 



2128 nex = 

104785 104678 

105146 104871 

105483 105396 

105659 105583 

106058 106008 

106328 106140 



>4895176 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4895176 



/6630 

1966 nex = 

111441 111255 

112230 111955 

112394 112307 

112556 112480 

112906 112856 

113220 112995 

77663 

2007 nex = 



Term 
Intr 
Intr 
Intr 
Init 

>4895176 

len = 



111441 111237 

112230 111955 

112394 112307 

112556 112480 

112906 112856 

/40538 

2556 nex = 



Term 42176 41886 
Intr 42395 42282






Intr 42663 42480
Intr 42988 42871
Intr 43261 43133
Intr 43455 43404
Intr 43792 43642
Init 44441 44284



Init 51902 51959 
Term 52945 53244 



>4895176



54124 53296 



Sngl 54143 53282

>4895176 72445 



Term
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Init

>4895176
len =

Term
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Intr
Init



2175 

65959 
66105 
66298 
66481 
66642 
66796 
66979 
67227 
67424 
67812 
68030 



nex = 

65856 
66059 
66203 
66382 
66582 
66725 
66899 
67159 
67338 
67723 
67896 



7912 
2890 nex =



65760 
65959 
66105 
66298 
66481 
66642 
66796 
66979 
67227 
67424 
67812 
68345 



65465 
65853 
66059 
66203 
66382 
66582 
66725
66899 
67159 
67338 
67723 
67896 



>4895176 710987
len = 914 nex =
Sngl 75026 75939

>4895176 /34976
len = 985 nex =
Sngl 84464 83479

>4895213 /17709
len = 215 nex =
Sngl 12033 11819

>4895213 /19452
len = 1181 nex =
Term 12235 11861
Intr 12448 12322
Intr 12634 12559
Init 13041 12797

>4895213 /168
len = 1128 nex =
Term 21760 21460
Init 22587 22368

>4895213 /3516B
len = 2411 nex =
Init 69033 69478
Intr 69674 69795
Intr 69886 69976
Intr 70067 70187
Intr 70283 70356
Term 70828 71443

>4895233 /19714
len = 1933 nex =
Term 14019 13908
Intr 14335 14258
Intr 14918 14868
Intr 15157 14999
Intr 15354 15239
Intr 15541 15460
Init 15691 15621

>4895233 76457
len = 2356 nex =
Term
Intr
Intr
Intr 
Intr 
Intr 
Intr 
Init 



14019 
14335 
14918 
15157 
15354 
15541 
15691 
16010 



13655 
14258 
14868 
14999 
15239 
15460 
15621 
15800 



>4895233 
len = 

>4895233 
len = 

>4895233 

len = 

Term 
Intr 
Intr 
Init 

>4914356 

len = 

Term 
Init 

>4914383 

len = 

Init 
Intr 
Intr 
Term 

>4914383 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4914399 

len = 



/19091 
1316 nex = 

/2018 
944 nex = 

/34688 

1882 nex = 

57501 57086 

57848 57580 

58122 58008 

58967 58722 

/19995 

1154 nex = 



25058 
25781 



24628 
25678 



/41120 

1665 nex =

69870 70209 

70434 70564 

70654 70875 

70961 71525 

722376 

1122 nex =



93930 
94116 
94386 
94572 
94741 



94034 
94290 
94491 
94661 
95051 



736778 
2278 nex 



Term 
Intr 
Intr 
Intr 



17584 17205 

17738 17692 

18153 18069 

18331 18253 
Intr
Intr 
Intr 
Init 

>4914399 
len = 
Sngl 

>4914400 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Intr 
Init 

>4914400 

len = 

Sngl 

>4914400 

Sngl 

>4914400 

len = 

Sngl 

>4914400 

len = 

Term 
Intr 
Intr 
Intr 
Init 

>4914400 

len = 

Sngl 

>4914422 



18578 18425 

18790 18711 

19120 18889 

19482 19368 

/97049 

537 nex = 

61982 61446 

/20435 

1768 nex =



20696 
20940 
21127 
21289 
21473 
21619 
22059 



20292 
20830 
21041 
21216 
21380 
21574 
21711 



/17491 

491 nex = 

25638 25148 

/32381 

674 nex = 

30579 31252 

/124761 

709 nex = 

52209 51501 

/117732 

1754 nex = 

85506 85180 

85754 85603 

86180 85944 

86678 86458 

86933 86763 

/42211 

630 nex = 

90161 89532 

/30671 



len = 735 nex = 2
Term
Init 

>4914422 
len =
Sngl 

>4914422 

len = 

Init 
Intr 
Intr 
Term 

>4914422 

len = 

Init 
Intr 
Term 

>4914422 

len = 

Init 
Intr 
Term 

>4914422 

len = 

Init 
Intr 
Term 

>4914422 

len = 

Sngl 

>4914422 

len = 

Sngl 

>4914422 

len = 

Term 



11060 10810 
11544 11267 

/41723 

523 nex = 

16294 16816 

/15801 

1274 nex = 

40051 40158 

40255 40329 

40525 40659 

41043 41301 

/21466 

2006 nex = 

46972 47324 
47772 47952 
48182 48359 

728549 

1090 nex = 

50748 50839 
51200 51370 
51631 51834 

/147351 

1099 nex =

50774 50839 
51200 51370 
51631 51859 

/15569 

268 nex = 

56936 56669 

/152675 

463 nex = 

57037 56575 

/32703 

1041 nex = 

57059 56669 
Intr
Init 



57259 57112 
57709 57543 



>4914422 
len = 
Sngl 

>4914422 
len = 



/97001 
1020 nex = 
57709 56690 
/13439 
1019 nex = 



Term 
Init 



>4914422 
len = 



Term 
Init 



>4914422 
len = 



57259 56691 
57709 57543 



/27704 
989 nex 



57259 57112 
57709 57543 



/104060 
1030 nex 



Term 
Intr 
Init 

>4914422 

len = 

Term 
Init 

>4914422 

len = 

Init 
Term 

>4914422 

len = 



57059 56690 
57259 57112 
57710 57543 

/32548

1050 nex = 

57259 56665 
57714 57543 

/4740 

1222 nex = 

69466 69678 
69803 69921 

/14629 

730 nex = 



Init 
Term 



>4914422 
len = 



71747 71925 
72026 72475 



/34593
1780 nex =



Init 
Intr 
Intr 
Intr 



73912 74057 

74215 74328 

74412 74925 

75022 75175 
Term

>4914449 

len = 

Term 
Intr
Init 

>4914449 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4914454 

len = 

Init 
Intr 
Intr 
Term 

>4914454 

len = 

Term 
Init 

>4914454 

Sngl 

>4914454 

len = 

Sngl 

>4914454 

len = 

Init 
Intr 
Intr 
Intr 
Intr 
Intr 
Term 



75262 75691 

/33307 

1602 nex = 

9666 9504 
10587 10510 
10901 10672 

/2508 

2620 nex = 

32362 32080 

32797 32572 

33006 32910 

33341 33085 

33603 33434 

34699 34297 

/20902 

1779 nex =

14139 14543 

14915 15176 

15255 15322 

15419 15917 

/7447 

1873 nex = 

16739 15950 
17822 17623 

/19084 

339 nex = 

18789 19127 

/118265 

391 nex =

33814 33429 

/11319 

2254 nex = 

4901 4989 

5232 5572 

5666 5761 

5846 5923 

6011 6187 

6602 6670 

6755 7154 
len = 2530 nex =

Term 16431 15960
Intr 16690 16520
Intr 16930 16820
Intr 17119 17018
Intr 17277 17215
Intr 17440 17369
Intr 17638 17537
Intr 17812 17711
Intr 17983 17911
Init 18119 18067



>4938473 
len = 



Init 34311 34428
Term 34806 35250



>4938473 
len = 2202 nex =

Term 75630 75293
Intr 75942 75702
Intr 76292 76068
Intr 76487 76363
Init 77494 77121



/2891 





len = 1229 nex = 1
Sngl 9654 10033 +

>4938473 /125503
len = 234 nex = 1
Sngl 9689 9922 +

>4938493 /21702
len = 4165 nex = 13
Init 13805 13896 +
Intr 14185 14254 +
Intr 14586 14698 +
Intr 14905 15001 +
Intr 15177 15256 +
Intr 15440 15634 +
Intr 15720 15803 +
Intr 15895 15955 +
Intr 16076 16136 +
Intr 16239 16332 +
Intr 16404 16476 +
Intr
Term 

>4938493 

len = 

Term 
Intr 
Intr 
Intr 
Intr 
Init 

>4938493 

len = 

Sngl 

>4972043 

len = 

Sngl 

>4972043 

len = 

Sngl 

>4972043 

len = 

Init 
Intr 
Intr 
Term 

>4972043 

len = 

Init 
Intr 
Term 

>4972043 

len = 

Term 
Intr 
Init 

>4972043 

len = 



16614 16696 
16814 17269 



/14713 
2612 nex =



37708 37490
37918 37806
38169 38036
38398 38290
38544 38485
38752 38670



/26276

310 nex =

80685 80993

/33435

796 nex =

16528 17323

/22677

674 nex =

18856 19529

/92148

970 nex = 

25133 25274 

25355 25571 

25661 25801 

25910 26100 

/2545

2009 nex =

2771 3001
3642 3942
4073 4779

/41319

1930 nex =

35309 35007
35674 35613
36929 36399

/35742

1947 nex = 
Init
Term 

>4972043 

len = 

Sngl 

>4972043 

Init 
Intr 
Intr 
Intr 
Intr 
Term 

>4972043 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4972043 

len = 

Init 
Intr 
Intr 
Intr 
Term 

>4972043 
len = 

>4972065 

len = 

Term 
Intr 
Init 

>4972065 

len = 



771 

/38287



578 
1995 



62463 63634 



1810 nex =



92440 92483 

92601 92697 

92948 93047 

93126 93211 

93299 93390 

93561 93675 



/4398 



1797 nex =



92470 92597 

92948 93047 

93126 93211 

93299 93390 

93561 93675 

/93806 

1701 nex = 

92571 92697 

92948 93047 

93126 93211 

93299 93390 

93561 93675 

/119346 

647 nex = 

/38344

1428 nex =

31960 31284
32312 32044
32711 32422

/34069

1213 nex = 



Init 47383 47581 
Intr 47679 47884 
Term 48236 48595
>4972065
len =

Sngl

>4972065

len =

Init
Intr
Intr
Term

>4972065
len =

Sngl

>4972065

len =

Term
Intr
Intr
Init

>4972065

len =

Sngl

>4972077

len =

Term
Init

Term
Init

>4972077
len =
Sngl
>4972087
len =



/10069 

434 nex = 

48305 48738 

/36741 

1705 nex = 

49180 49398 

49482 50010 

50089 50447 

50541 50884 

/101714 

294 nex = 

52352 52059 

/35207 

1280 nex = 

51432 51078 
51652 51532 
51983 51743 
52357 52061 
/126440 

391 nex = 

59542 59932 

/12759 

2470 nex = 

17826 17297 
18688 17900 

/25532

2549 nex =

17826 17288
18688 17900

/116606

490 nex =

58084 58567

/38343

817 nex = 
Sngl 23707 24523

>4972087 /28506
len = 1074 nex =
Init 28873 29020
Intr 29372 29517
Term 29781 29946

>4972087 /30708
len = 941 nex =
Sngl 30856 29916

>4972087 /38500
len = 2072 nex =
Init 33934 34065
Intr 34547 34684
Intr 34962 34999
Intr 35096 35146
Intr 35230 35397
Intr 35468 35580
Term 35674 36005

>4972087 /115901
len = 1524 nex =
Term 8097 8047
Init 9570 9200

>4996901 /2496
len = 1366 nex =
Init 15599 15705
Intr 15871 15982
Term 16118 16964

>4996901 /35390
len = 2432 nex =
Term 872 613
Intr 1483 1364
Intr 1822 1718
Intr 2438 2370
Init 3044 2539

>4996901 /15624
len = 1960 nex =
Init 55525 55641
Intr 55761 55942
Intr 56052 56159
Intr 56321 55410
Intr 56669 56753
Intr 57080 57174
Term 57264 57484



/11543 



Init 78466 78573 + 0
Intr 78960 79043 + 0
Intr 79136 79248 + 0
Intr 79343 79556 + 0
Intr 79691 79764 + 0
Intr 79859 79912 + 0
Intr 80008 80115 + 0
Intr 80204 80262 + 0
Intr 80494 80704 + 0
Term 80794 81047 + 0



Term
Init

>4996901

len =

Init
Intr
Intr
Term

>4996901
len =

Init
Intr
Term

>4996901
Term
Init

>4996902
len =



81166 
81526 81238 

/36526

1570 nex =

81684 81730

81908 82053

82294 82409

82501 83245

/37444

1696 nex =

83424 83855
84133 84410
84602 85119

/30264

1299 nex =

91084 90551
91849 91557

/27688

879 nex =



Term 
Init 



61399 61074 
61655 61591 



>4996902 

/142223
len =
Sngl 
>4996902 
len = 



2719 nex =

Term 84065 83542
Intr 84858 84288
Intr 85264 84965
Intr 85618 85341
Init 86260 85915



>4996903



/21835 



>5002514
len = 2193 nex =

Term 60521 59989
Intr 60800 60607
Intr 61272 60875
Init 62181 61788



len = 3101 nex =

Term 22954 22566
Intr 23196 23047
Intr 23432 23292
Intr 23601 23514
Intr 23740 23685
Intr 23895 23820
Intr 24045 23972
Intr 24285 24127
Intr 24518 24375
Intr 24733 24601
Intr 24951 24845
Init 25666 25475



Term 34112 33856
Init 34871 34579



>5002514
len = 4330 nex =

Term 47968 47811
Intr 48454 48334
Intr 48638 48589
Intr 48846 48733
Intr 49060 48980
Intr 49228 49147
Intr 49372 49318
Intr 49764 49534
Intr 50625 49920
Intr
Intr
Intr
Init

>5002514

len =

Init
Intr
Intr
Term

>5019262
len =
Term
Init

>5019264
len =

Sngl
>5019264

len =

Sngl
>5019264

len =
Sngl

>5019265
len =

Term
Intr
Intr
Intr
Init

>5041959
len =
Sngl

>5041960
len =
Sngl



51051 50964 

51239 51133 

51447 51329 

52137 51799 

/15314 

1710 nex = 

69964 70199 

70466 70639 

70727 70894 

71131 71673 

/93927

983 nex = 

59395 58861 
59843 59533 

/110927 

415 nex = 

4369 4766 

/122052 

370 nex = 

4404 4766 

/23133 

743 nex = 

98229 97508 

/2686

1280 nex =

83 1



243 167 

604 337 

923 718 

1280 1010 

/117665

778 nex =

29026 29803

/39877

1241 nex = 

33119 34355 
>5041962
len = 



/32146 
2478 nex 



Init 27173 27393
Intr 27737 27765
Intr 28119 28208
Intr 28281 28520
Intr 28603 28800
Intr 28954 29072
Term 29160 29650



>5041962 
len = 



/37936
1330 nex 



Init 36213 36536
Intr 36817 36933
Intr 37029 37116
Term 37380 37537



>5041962 
len = 



/28093 
1450 nex = 



Init 
Intr 
Intr 
Term 

>5041962 

len = 

Init 
Term 



36215 36536 

36817 36933 

37029 37116 

37380 37657 

/109138 

692 nex = 

36222 36536 

36817 36901 



>5041968 
len = 



/92527
1055 nex 



Term 
Init 



>5051726 
len = 



39971 39857 
40467 40315 



/5364
1619 nex 



Init 106365 106639
Intr 106745 106850
Term 107554 107983



>5051726 
len = 



/19178
1333 nex 



Init 18384 18450 
Intr 18595 18622 
Intr 19014 19098
Intr
Term 

>5051726 

len = 

Init 
Intr 
Intr 
Term 

>5051726 

len = 

Sngl 



19213 19302 
19395 19716 

/31050 

1255 nex = 

18401 18450 

19014 19098 

19213 19302 

19395 19642 

/5398 

761 nex = 

9718 8958 
CLAIMS

What is claimed is: 



1. An isolated nucleic acid molecule comprising a nucleic acid having a
nucleotide sequence which encodes an amino acid sequence exhibiting at least 40% sequence 
identity to an amino acid sequence encoded by 

(a) a nucleotide sequence described in Table 1 or a fragment thereof; or 

(b) a complement of a nucleotide sequence shown in Table 1 or a fragment 
thereof. 

2. An isolated nucleic acid molecule comprising a nucleic acid having a 
nucleotide sequence which exhibits at least 65% sequence identity to 

(a) a nucleotide sequence shown in Table 1 or a fragment thereof; or 

(b) a complement of a nucleotide sequence described in Table 1 or a fragment 
thereof.

3. An isolated nucleic acid molecule comprising a nucleic acid having a 
nucleotide sequence which exhibits at least 65% sequence identity to a gene comprising 

(a) a nucleotide sequence shown in Table 1 or a fragment thereof; or 

(b) a complement of a nucleotide sequence described in Table 1 or a fragment 
thereof.

4. An isolated nucleic acid molecule which is the reverse of the isolated 
nucleotide sequence according to claim 1, such that the reverse nucleotide sequence has a
sequence order which is the reverse of the sequence order of said isolated nucleotide
sequence according to claim 1.

5. An isolated nucleic acid molecule comprising a nucleic acid capable of 
hybridizing to a nucleic acid having a sequence selected from the group consisting of: 

(a) a nucleotide sequence which is shown in Table 1; and

(b) a nucleotide sequence which is complementary to a nucleotide sequence
shown in Table 1;

under conditions that permit formation of a nucleic acid duplex at a temperature from about
40°C and 48°C below the melting temperature of the nucleic acid duplex.

6. The nucleic acid molecule according to claim 1, wherein said nucleic acid 
comprises an open reading frame. 
7. The isolated nucleic acid molecule of claim 1, wherein said nucleic acid is
capable of functioning as a promoter, a 3' end termination sequence, an untranslated region 
(UTR), or as a regulatory sequence. 

8. The isolated nucleic acid molecule of claim 7, wherein said nucleic acid is a 
promoter and comprises a sequence selected from the group consisting of a TATA box 
sequence, a CAAT box sequence, a motif of GCAATCG or any transcription-factor binding 
sequence, and any combination thereof. 

9. The isolated nucleic acid molecule of claim 7, wherein the nucleic acid 
sequence is a regulatory sequence which is capable of promoting seed-specific expression, 
embryo-specific expression, ovule-specific expression, tapetum-specific expression or
root-specific expression of a sequence or any combination thereof.

10. A vector construct comprising a nucleic acid molecule according to claim 1,
wherein said nucleic acid molecule is heterologous to any element in said vector construct. 

11. A vector construct comprising: 

(a) a first nucleic acid having a regulatory sequence capable of causing 
transcription and/or translation; and 

(b) a second nucleic acid having the sequence of the isolated nucleic acid 
molecule according to claim 1;

wherein said first and second nucleic acids are operably linked and 

wherein said second nucleic acid is heterologous to any element in said vector construct. 

12. The vector construct according to claim 11, wherein said first nucleic acid is 
native to said second nucleic acid. 

13. The vector construct according to claim 11, wherein said first nucleic acid is
heterologous to said second nucleic acid. 

14. A vector construct comprising: 

(c) a first nucleic acid having the sequence of the isolated nucleic acid 
molecule according to claim 7; and 

(d) a second nucleic acid; 

wherein said first and second nucleic acids are operably linked and

wherein said first nucleic acid is heterologous to any element in said vector construct. 

15. The vector construct according to claim 14, wherein said first nucleic acid is 
native to said second nucleic acid. 

16. The vector construct according to claim 14, wherein said first nucleic acid is 
heterologous to said second nucleic acid. 
17. A host cell comprising an isolated nucleic acid molecule according to claim 1,
wherein said nucleic acid molecule is flanked by exogenous sequence. 

18. A host cell comprising a vector construct of claim 10. 

19. A host cell comprising a vector construct of claim 11.

20. A host cell comprising a vector construct of claim 12. 

21. A host cell comprising a vector construct of claim 13.

22. A host cell comprising a vector construct of claim 14. 

23. A host cell comprising a vector construct of claim 15. 

24. A host cell comprising a vector construct of claim 16.

25. An isolated polypeptide comprising an amino acid sequence 

(a) exhibiting at least 40% sequence identity of an amino acid sequence encoded 
by a sequence shown in Table 1 or a fragment thereof; and 

(b) capable of exhibiting at least one of the biological activities of the 
polypeptide encoded by said nucleotide sequence shown in Table 1 or a 
fragment thereof. 

26. The isolated polypeptide of claim 25, wherein said amino acid sequence 
exhibits at least 75% sequence identity to an amino acid sequence encoded by a 
sequence shown in Table 1 or a fragment thereof. 

27. The isolated polypeptide of claim 25, wherein said amino acid sequence 
exhibits at least 85% sequence identity to an amino acid sequence encoded by a 
sequence shown in Table 1 or a fragment thereof. 

28. The isolated polypeptide of claim 25, wherein said amino acid sequence 
exhibits at least 90% sequence identity to an amino acid sequence encoded by a 
sequence shown in Table 1 or a fragment thereof. 

29. An antibody capable of binding the isolated polypeptide of claim 25. 

30. A method of introducing an isolated nucleic acid into a host cell comprising: 

(a) providing an isolated nucleic acid molecule according to claim 1; and

(b) contacting said isolated nucleic with said host cell under conditions that
permit insertion of said nucleic acid into said host cell. 

31. A method of transforming a host cell which comprises contacting a host cell 
with a vector construct according to claim 10. 

32. A method of transforming a host cell which comprises contacting a host cell 
with a vector construct according to claim 11.
33. A method of transforming a host cell which comprises contacting a host cell
with a vector construct according to claim 12. 

34. A method of transforming a host cell which comprises contacting a host cell 
with a vector construct according to claim 13. 

35. A method of transforming a host cell which comprises contacting a host cell 
with a vector construct according to claim 14. 

36. A method of transforming a host cell which comprises contacting a host cell 
with a vector construct according to claim 15. 

37. A method of transforming a host cell which comprises contacting a host cell 
with a vector construct according to claim 16. 

38. A method of modulating transcription and/or translation of a nucleic acid in a 
host cell comprising: 

(a) providing the host cell of claim 17; and 

(b) culturing said host cell under conditions that permit transcription or 
translation. 

39. A method for detecting a nucleic acid in a sample which comprises: 

(a) providing an isolated nucleic acid molecule according to claim 1;

(b) contacting said isolated nucleic acid molecule with a sample under 
conditions which permit a comparison of the sequence of said isolated 
nucleic acid molecule with the sequence of DNA in said sample; and 

(c) analyzing the result of said comparison. 

40. The method according to claim 39, wherein said isolated nucleic acid 
molecule and said sample are contacted under conditions which permit the formation 
of a duplex between complementary nucleic acid sequences. 

41. A plant or cell of a plant which comprises a nucleic acid molecule according
to claim 1 which is exogenous to said plant or plant cell. 

42. A plant or cell of a plant which comprises a nucleic acid molecule according 
to claim 1, wherein said nucleic acid molecule is heterologous to said plant or said 
cell of a plant. 

43. A plant or cell of a plant which has been transformed with a nucleic acid
molecule according to claim 1.

44. A plant or cell of a plant which comprises a vector construct according to 
claim 10. 
45. A plant or cell of a plant which has been transformed with a vector construct
according to claim 10.

46. A plant which has been regenerated from a plant cell according to claim 41.

47. A plant which has been regenerated from a plant cell according to claim 42. 

48. A plant which has been regenerated from a plant cell according to claim 43. 

49. A plant which has been regenerated from a plant cell according to claim 44. 

50. A plant which has been regenerated from a plant cell according to claim 45. 
ABSTRACT OF THE DISCLOSURE

The present invention provides DNA molecules that constitute fragments of the 
genome of a plant, and polypeptides encoded thereby. The DNA molecules are useful for 
specifying a gene product in cells, either as a promoter or as a protein coding sequence or as 
a UTR or as a 3' termination sequence, and are also useful in controlling the behavior of a
gene in the chromosome, in controlling the expression of a gene or as tools for genetic 
mapping, recognizing or isolating identical or related DNA fragments, or identification of a 
particular individual organism, or for clustering of a group of organisms with a common 
trait. 
POWER OF ATTORNEY



CERES, INC.



3007 Malibu Canyon Road
Malibu, CA 90265



I, Richard Hamilton, Chief Financial Officer of CERES, INC.
of 3007 Malibu Canyon Road, Malibu, California 90265, grant
Power of Attorney and authority to empower the following
attorneys to act on behalf of CERES, INC. for executing Verified
Statements (Declarations) Claiming Small Entity Status to be
submitted to the U.S. Patent and Trademark Office in connection
with the filing of provisional or regular patent applications on
behalf of CERES, INC.



Raymond C. Stewart (Reg. No. 21,066)
Joseph A. Kolasch (Reg. No. 22,463)
Leonard R. Svensson (Reg. No. 30,330)
Gerald M. Murphy, Jr. (Reg. No. 28,977)
Mark J. Nuell (Reg. No. 36,623)



This Power of Attorney is to remain in full force and
effect until terminated by an official of CERES, INC.




Richard Hamilton




